DESCRIPTION

CLONED PORPHYROMONAS GINGIVALIS GENES
AND PROBES FOR THE DETECTION OF PERIODONTAL DISEASE

The subject invention was made with government support under a research project supported 
by the National Institutes of Health Grant Nos. DE 07496 and DE 00336. The government has 
certain rights in this invention. 

Cross-Reference to a Related Application

This is a continuation-in-part of co-pending application Serial No. 08/353,485, filed 
December 9, 1994, which is a continuation-in-part of application Serial No. 07/647,119, filed 
January 25, 1991; which is a continuation-in-part of application Serial No. 07/241,640, filed 
September 8, 1988, now abandoned.

Background of the Invention 
Periodontal disease (PD) is a chronic inflammatory disease which results in the destruction 
of the supporting tissues of teeth. Although the specific microbial etiology of PD is not known, it 
is widely accepted that bacteria are the contributing agents of the disease. 

The presence of a complex microflora in the subgingival crevice has complicated the
identification of the specific etiologic agents of PD. However, it appears that a few genera, primarily
gram-negative anaerobes, are associated with disease progression. Several lines of evidence strongly
implicate the gram-negative anaerobic bacterium Porphyromonas gingivalis, previously known to
those skilled in the art as Bacteroides gingivalis, as an etiological agent of adult periodontal disease
(White, D., D. Mayrand [1981] "Association of Oral Bacteroides with Gingivitis and Adult
Periodontitis," J. Periodont. Res. 1:1-18; Takazoe, I., T. Nakamura, K. Okuda [1984]
"Colonization of the Subgingival Area by Bacteroides gingivalis," J. Dent. Res. 63:422-426). For
example, relatively high proportions of P. gingivalis have been isolated from adult periodontitis
lesions, patients with adult periodontitis have been found to have higher levels of IgG antibodies to
P. gingivalis than do normal adults, and local immunity to P. gingivalis is greater in the more
advanced cases than in the early forms of periodontal disease. P. gingivalis also appears to be a
causative agent of experimental periodontitis in animals (Slots, J., E. Hausmann [1979]
"Longitudinal Study of Experimentally Induced Periodontal Disease in Macaca arctoides:
Relationship Between Microflora and Alveolar Bone Loss," Infect. Immun. 23:260-269). In
addition, P. gingivalis possesses a variety of suspected virulence factors such as proteases,
collagenases, immunoglobulin degrading enzymes, and adhesins.

In order to exert their pathogenic effects, periodontopathic bacteria such as P. gingivalis 
must possess characteristics which enable them to colonize the host, survive in the periodontal 
pocket, possibly invade the gingival tissues, and to destroy the collagenous periodontal ligament, the 
alveolar bone, and other tissue components surrounding the tooth. Components of bacteria which 
mediate attachment to host tissues include surface structures such as fimbriae, capsular materials,
lipopolysaccharides, and membrane-associated extracellular vesicles.

The hemagglutinating activity of P. gingivalis has been studied as a parameter that affects
the adherence of this organism in the periodontal pocket. Sera from patients with adult periodontitis
possess high antibody levels to the P. gingivalis hemagglutinin. It is thus suggested that the 
adhesive surface structures such as hemagglutinin participate in P. gingivalis colonization and 
antigenic stimulation of the host. 

Investigations have reported the isolation of hemagglutinin activity from P. gingivalis. 
Boyd and McBride (Boyd, J., B.C. McBride [1984] "Fractionation of Hemagglutinating and
Bacterial Binding Adhesins of Bacteroides gingivalis," Infect. Immun. 45:403-409) prepared an
outer membrane component containing hemagglutinating activity from P. gingivalis W12. This
preparation contained three major proteins with molecular weights of 69,000, 41,500, and 22,000.
Inoshita et al. (Inoshita, E., A. Amano, T. Hanioka, H. Tamagawa, S. Shizukuishi, A. Tsunemitsu
[1986] "Isolation and Some Properties of Exohemagglutinin from the Culture Medium of
Bacteroides gingivalis 381," Infect. Immun. 52:421-427) isolated hemagglutinating activity from
culture supernatants of P. gingivalis 381. The isolated hemagglutinin component contains three
major proteins with molecular weights of 24,000, 37,000, and 44,000. Okuda et al. (Okuda, K., A.
Yamamoto, Y. Naito, I. Takazoe, J. Slots, R.J. Genco [1986] "Purification and Properties of
Hemagglutinin from Culture Supernatant of Bacteroides gingivalis," Infect. Immun. 55:659-665)
also purified a hemagglutinin of P. gingivalis 381 from culture supernatant which appears to have
vesicle or tubelike structures and is comprised mainly of a 40,000 molecular-weight protein. Their
recent report indicated that sera from most patients with adult periodontitis react to the
hemagglutinin antigen at 43,000 and 57,000 molecular weights (Naito, Y., K. Okuda, I. Takazoe 
[1987] "Detection of Specific Antibody in Adult Human Periodontitis Sera to Surface Antigens of 
Bacteroides gingivalis," Infect. Immun. 55(3):832-834). 

Recombinant DNA techniques have proven to be powerful tools for the study of 
pathogenesis. However, recombinant DNA techniques have been applied only sparingly to the study 
of gram-negative anaerobic pathogens and even less to the study of the molecular mechanisms of
periodontopathogenesis. The recombinant DNA methodologies offer advantages over previous
methods used in the study of oral pathogens. For example, the cloning of P. gingivalis antigens 
allows for a genetic and molecular analysis of the gene(s) which presently is difficult due to the lack 
of knowledge about the genetic system in P. gingivalis. 

Brief Summary of the Invention 
Genes have been cloned and the proteins encoded thereby have been isolated from organisms
associated with periodontal disease (PD). In particular, genes from Porphyromonas gingivalis,
which is an etiological agent of adult PD, have been identified, characterized, and sequenced. These
genes have also been ligated to an appropriate vector and used to transform an appropriate host cell.
The recombinant cells express antigens which elicit immunological responses. Antigens expressed
by the P. gingivalis clones are also identified and described here.

The invention provides, inter alia, a means of detecting the presence of disease-causing P.
gingivalis. The detection method involves the use of DNA probes and antibody probes which
selectively identify the presence of these bacteria or can be used to identify other organisms,
including other prokaryotes or eukaryotes, which have similar nucleic acid or amino acid sequences.
Also provided are polypeptides which can be used for the production of antibodies to the organisms
associated with PD. The antibodies selectively and specifically bind to the subject proteins and can
be utilized in purification and identification procedures. These genes and polypeptides can be used
as a vaccine against PD. Further, a means of producing monoclonal antibodies for the antigens
associated with periodontal disease is also provided.



Brief Description of the Drawings 

Figure 1 shows a schematic diagram of restriction enzyme recognition sites of recombinant 
plasmids from clones 2, 5, and 7. The solid lines represent P. gingivalis DNA inserts. The hatched 
boxes represent pUC9 regions. 

Figure 2 shows a restriction map of a hemagglutinin gene, hagB. The hemagglutinin gene 
is contained on a HindIII fragment in pUC9.

Figure 3 shows a restriction enzyme map of cloned EcoRV fragments of P. gingivalis 381.
The heavy shaded area designates the originally cloned ST2 fragment; the thin shaded area 
designates the amplified IPCR fragment. 

Figure 4 shows the restriction enzyme map of the prtP gene. The top line represents the
prtP gene sequence; the bottom line represents the gene product. Restriction sites shown are: B,
BamHI; N, NspI; A, AspEI; S, SacI; X, XcmI. Fragments used as probes for Southern blot analyses
are shown as heavy bars below the DNA sequence and in the comparable position below the protein
sequence. The DNA region homologous to IS1126 is underlined. Regions repeated within the
protein are shown as identical boxes, and the Pro-Asn repeat region is indicated by an asterisk. 
Putative autodegradation cleavage sites and the signal peptide cleavage site are indicated below the 
gene product. 

Brief Description of the Sequences 
SEQ ID NO. 1 is the nucleotide sequence of the hemagglutinin gene designated hagA.

SEQ ID NO. 2 is the derived amino acid sequence of the polypeptide encoded by the hagA
gene.

SEQ ID NO. 3 is the nucleotide sequence of the hemagglutinin gene designated hagB.

SEQ ID NO. 4 is the derived amino acid sequence of the polypeptide encoded by the hagB
gene.

SEQ ID NO. 5 is the nucleotide sequence of the hemagglutinin gene designated hagC.

SEQ ID NO. 6 is the derived amino acid sequence of the polypeptide encoded by the hagC
gene.

SEQ ID NO. 7 is the nucleotide sequence of the hemagglutinin gene designated hagD.

SEQ ID NO. 8 is the derived amino acid sequence of the polypeptide encoded by the hagD
gene.

SEQ ID NO. 9 is the nucleotide sequence of the gene designated prtP.

SEQ ID NO. 10 is the derived amino acid sequence of the polypeptide encoded by the prtP
gene.

SEQ ID NO. 11 is primer APF 147 used according to the subject invention.

SEQ ID NO. 12 is primer APF 148 used according to the subject invention.

SEQ ID NO. 13 is the nucleotide sequence for the entire hagA gene obtained from the
EcoRV fragment of the P. gingivalis strain, according to the subject invention.

SEQ ID NO. 14 is the deduced amino acid sequence of the polypeptide encoded by the
entire hagA gene.

SEQ ID NO. 15 is the nucleotide sequence of the first approximately 1.3 kb repeat
sequence from hagA, designated HArep1.

SEQ ID NO. 16 is the deduced amino acid sequence of the polypeptide encoded by HArep1.

SEQ ID NO. 17 is the nucleotide sequence of the second approximately 1.3 kb repeat
sequence from hagA, designated HArep2.

SEQ ID NO. 18 is the deduced amino acid sequence of the polypeptide encoded by HArep2.

SEQ ID NO. 19 is the nucleotide sequence of the third approximately 1.3 kb repeat
sequence from hagA, designated HArep3.

SEQ ID NO. 20 is the deduced amino acid sequence of the polypeptide encoded by HArep3.

SEQ ID NO. 21 is the nucleotide sequence of the fourth approximately 1.3 kb repeat
sequence from hagA, designated HArep4.

SEQ ID NO. 22 is the deduced amino acid sequence of the polypeptide encoded by HArep4.

SEQ ID NO. 23 is a negative primer at 405 nucleotides (nt) upstream of the 5' end of the
ST2 fragment used according to the subject invention.

SEQ ID NO. 24 is a positive primer at 529 nt 3' of the ST2 fragment used according to
the subject invention.

SEQ ID NO. 25 is the nucleotide sequence of the entire hagD gene. 
SEQ ID NO. 26 is the deduced amino acid sequence of a polypeptide encoded by a first 
open reading frame of the entire hagD gene. 

SEQ ID NO. 27 is the deduced amino acid sequence of a polypeptide encoded by a second 
open reading frame of the entire hagD gene. 

SEQ ID NO. 28 is the nucleotide sequence of the hemagglutinin gene designated hagE. 
SEQ ID NO. 29 is the deduced amino acid sequence of the polypeptide encoded by an open 
reading frame of the hagE gene. 

Detailed Description of the Invention 
The DNA sequences of the present invention comprise structural genes encoding proteins 
which can be involved in the pathogenesis of bacteria responsible for periodontal disease. The genes 
of the subject invention can be isolated from the DNA of Porphyromonas gingivalis. The genes of 
the subject invention are further characterized by determination of their nucleotide sequences. After 
obtaining the DNA, a gene library can be developed and the resulting DNA fragments inserted into 
suitable cloning vectors which are introduced into a compatible host. Depending on the particular 
host used, the vector can include various regulatory and other regions, usually including an origin 
of replication, and one or more promoter regions and markers for the selection of transformants. In 
general, the vectors will provide regulatory signals for expression, amplification, and for a regulated 
response to a variety of conditions and reagents. 

Various markers can be employed for the selection of transformants, including biocide 
resistance, particularly to antibiotics such as ampicillin, tetracycline, trimethoprim, chloramphenicol, 
and penicillin; toxins, such as colicin; and heavy metals, such as mercuric salts. Alternatively, 
complementation providing an essential nutrient to an auxotrophic host can be employed.

Hosts which may be employed for the production of the polypeptides of the present
invention include unicellular microorganisms, such as prokaryotes, i.e., bacteria; and eukaryotes,
such as fungi, including yeasts, algae, protozoa, molds, and the like. Specific bacteria which are
susceptible to transformation include members of the Enterobacteriaceae, such as strains of
Escherichia coli, Salmonella; Bacillaceae, such as Bacillus subtilis; Pneumococcus; Streptococcus;
Haemophilus influenzae, and yeasts such as Saccharomyces, among others.

The DNA sequences can be introduced directly into the genome of the host or can first be
incorporated into a vector which is then introduced into the host. Exemplary methods of direct
incorporation include transduction by recombinant phage or cosmids, transfection where specially
treated host bacterial cells can be caused to take up naked phage chromosomes, and transformation
by calcium precipitation. These methods are well known in the art. Exemplary vectors include
plasmids, cosmids, and phages.

Genomic libraries of P. gingivalis DNA were constructed in known plasmid expression
vectors. For example, the plasmid expression vector, pUC9, contains the pBR322 origin of
replication, the pBR322 ampicillin resistance gene, and a portion of the lacZ gene of E. coli which
codes for the α-peptide of β-galactosidase. The amino terminus of the lacZ gene contains a
polylinker region which has multiple unique cloning sites. Transformation of E. coli JM109, which
is defective in β-galactosidase, with pUC9 complements the bacterial β-galactosidase activity,
resulting in the ability of the bacterial cell to metabolize the lactose analog X-GAL to a blue color.
Cloned DNA inserted in the polylinker region interrupts the lacZ gene of the plasmid. Therefore,
E. coli transformants resulting from recombinant plasmids are unable to metabolize X-GAL and
appear as white colonies on X-GAL containing plates.

E. coli clones were isolated which stably exhibited P. gingivalis antigen expression. These
antigens were detected in intact cells both by filter-binding enzyme immunoassay and ELISA. One
of these clones, clone 2, was found to encode a polypeptide with an average molecular weight of
greater than 125 kD, seen in polyacrylamide gels and detected by Western blot analysis. This
polypeptide was later determined to be greater than 144 kD. The entire hagA gene which was
originally identified from clone 2 is now determined to encode a 283.3 kD protein. Expression of
the P. gingivalis antigen in clone 2 occurs either in the presence or absence of IPTG but is enhanced
by IPTG stimulation. The expression of the clone 3 antigen was also found to be stimulated by
IPTG in the same manner as clone 2.

When antigen-expressing clones were surveyed for functional activities, clones 2, 5, and 7
were able to agglutinate erythrocytes whereas E. coli JM109 (pUC9) was not. The restriction maps
and Southern blot hybridization of these clones indicated that clone 2 cells contain a Porphyromonas
DNA insert different from clones 5 and 7. Clone 5, which is also able to autoagglutinate, has a 760
bp DNA fragment in addition to a 4,800 bp fragment in common with the clone 7 insert. Subcloning
of these two fragments in different orientations revealed that the 4,800 bp DNA encoded the
hemagglutinating activity and the 760 bp DNA the autoagglutinating activity. Both fragments
must contain a Porphyromonas promoter since the subclones with opposite orientations of the
inserts still express functional proteins, indicating that antigen expression of clones 5 and 7 is not
stimulated by IPTG.

Western blot analysis of clones 5 and 7 and minicell analysis of the subclones further
revealed that the P. gingivalis DNA fragment encoded polypeptides of approximately 16 kD and
approximately 49-50 kD. These polypeptides were sized using SDS-PAGE, under denaturing
conditions. A native 49-50 kD protein was also purified by immunoaffinity chromatography. No
other purified 49-50 kD protein associated with hemagglutination has been reported. Therefore, the
49-50 kD protein is a previously undetected surface antigen involved in hemagglutination.

E. coli adsorbed rabbit-polyclonal antibody against clone 2 was found to react with several
bands in the P. gingivalis cell lysate preparation separated by SDS-PAGE. The most rapidly
developing and strongest reaction appeared at two bands of 43 kD and 38 kD. Two bands of 32 kD
and 30 kD appeared later and three faint bands of 110 kD, 90 kD and 75 kD sometimes were visible
still later. This strongly suggests that the P. gingivalis hemagglutinin is expressed in clone 2.

E. coli adsorbed rabbit-polyclonal antibody against clones 5 and 7 also reacted with two
bands of 43 kD and 38 kD, but barely reacted with the higher bands of 110 kD, 90 kD, and 75 kD,
and did not react with the bands of 32 kD and 30 kD. Thus, clones 5 and 7 contain DNA inserts
which are nonhomologous with clone 2 and express different antigenic epitopes, but all function as
hemagglutinin. The clone 7 insert contains a Porphyromonas promoter but the clone 2 insert does
not. An E. coli host (clone 2) has been designated E. coli pST 2 and deposited with the American
Type Culture Collection (ATCC), 12301 Parklawn Drive, Rockville, MD 20852. Also, an E. coli
host (clone 5) has been designated E. coli pST 5 and it, too, has been deposited with the ATCC.
These deposits were assigned the following accession numbers:

Culture           Accession number    Deposit date

E. coli pST 2     ATCC 67733          June 24, 1988
E. coli pST 5     ATCC 67734          June 24, 1988

The subject cultures have been deposited under conditions that assure that access to the cultures
will be available during the pendency of this patent application to one determined by the
Commissioner of Patents and Trademarks to be entitled thereto under 37 CFR 1.14 and 35 U.S.C.
122. The deposits are available as required by foreign patent laws in countries wherein counterparts
of the subject application, or its progeny, are filed. However, it should be understood that the
availability of a deposit does not constitute a license to practice the subject invention in derogation
of patent rights granted by governmental action. Further, the subject culture deposits will be stored
and made available to the public in accord with the provisions of the Budapest Treaty for the Deposit
of Microorganisms, i.e., they will be stored with all the care necessary to keep them viable and
uncontaminated for a period of at least five years after the most recent request for the furnishing of
a sample of a deposit, and in any case, for a period of at least 30 (thirty) years after the date of
deposit or for the enforceable life of any patent which may issue disclosing the cultures. The
depositor acknowledges the duty to replace a deposit should the depository be unable to furnish a
sample when requested. All restrictions on the availability to the public of the subject culture
deposits will be irrevocably removed upon the granting of a patent disclosing them.

The novel genes disclosed and claimed herein can be probed out of the E. coli strains which
have been deposited with the ATCC. The isolation of these genes can be performed using techniques
which are well-known in the molecular biology art. The isolated genes can be inserted into
appropriate vehicles which can then be used to transform another microbe.

It is well understood in the field of biotechnology that the subject genes and gene products
have many valuable uses. For example, the genes themselves, and fragments thereof, which
comprise particular nucleic acid sequences can be used to specifically and selectively hybridize to,
or probe, other nucleic acid sequences to determine the presence of homologous sequences therein.
This use of the subject nucleotide sequences, or fragments thereof, as probes can have advantageous
applications in their use as a diagnostic tool, identifying organisms or other transformants that have
nucleic acid sequences which are sufficiently homologous such that, using standard procedures and
conditions, hybridization can occur between the test sequences and the probe. As used herein,
substantial sequence homology refers to homology which is sufficient to enable the variant to
function in the same capacity as the original probe. Preferably, this homology is greater than 50%;
more preferably, this homology is greater than 75%; and most preferably, this homology is greater
than 90%. The degree of homology needed for the variant to function in its intended capacity will
depend upon the intended use of the sequence. It is well within the skill of a person trained in this
art to make mutational, insertional, and deletional mutations which are designed to improve the
function of the sequence or otherwise provide a methodological advantage.
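
By way of illustration only, the preference tiers stated above (greater than 50%, 75%, and 90%) can be sketched in code. The sketch below assumes that "homology" is approximated by simple percent identity over two sequences that have already been aligned to equal length; that simplification, and the function names percent_identity and homology_tier, are hypothetical and are not taken from the specification.

```python
def percent_identity(probe: str, variant: str) -> float:
    """Percent of aligned positions at which probe and variant carry the same base.

    Assumes both sequences are pre-aligned to equal length; a gap ('-')
    never counts as a match.
    """
    if len(probe) != len(variant):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not probe:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(probe.upper(), variant.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(probe)


def homology_tier(pct: float) -> str:
    """Map a percentage onto the three preference tiers recited in the text."""
    if pct > 90:
        return "most preferred (>90%)"
    if pct > 75:
        return "more preferred (>75%)"
    if pct > 50:
        return "preferred (>50%)"
    return "below the preferred range"


if __name__ == "__main__":
    # Toy 10-base sequences differing at one position (illustrative only).
    pct = percent_identity("ACGTACGTAC", "ACGTACGTTC")
    print(pct, homology_tier(pct))
```

Because each tier is recited as "greater than", a variant at exactly 90% identity falls in the "more preferred" tier rather than the "most preferred" tier.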

In addition, the subject nucleotide sequences and fragments thereof can be useful as primers
in the preparation and manufacture of sequences by polymerase chain reaction (PCR), inverse
polymerase chain reaction (IPCR), or other nucleic acid synthesis methods. Obviously, the subject
genes and fragments can be useful for the production of the gene product, i.e., the antigen or
polypeptides encoded thereby.

Mutations, insertions, and deletions can be produced in a given polynucleotide sequence in
many ways, and these methods are known to the ordinary skilled artisan. Other methods may
become known in the future.

The known methods include, but are not limited to: 

(1) synthesizing chemically or otherwise an artificial sequence which is a mutation,
insertion or deletion of the known sequence;
(2) using a probe of the present invention to obtain via hybridization a new sequence
or a mutation, insertion or deletion of the probe sequence; and
(3) mutating, inserting or deleting a test sequence in vitro or in vivo.

It is important to note that the mutational, insertional, and deletional variants generated from
a given probe may be more or less efficient than the original probe. Notwithstanding such
differences in efficiency, these variants are within the scope of the present invention. Thus,
mutational, insertional, and deletional variants of the disclosed sequences can be readily prepared
by methods which are well known to those skilled in the art. These variants can be used in the same
manner as the instant probes so long as the variants have substantial sequence homology with the
probes.

The gene products can also have a variety of uses. For example, the antigens so produced
by a gene in a transformed host can be useful in the production of antibodies to the antigen. Those
antibodies can be used as probes, when labeled, or can be used in affinity separation techniques.
These polypeptides can also be useful as molecular weight markers in chromatographic or
electrophoretic procedures, or the like, where molecular weights are used to characterize an unknown
polypeptide or identify or confirm the existence of a known polypeptide.

Following are examples which illustrate materials, methods and procedures, including the 
best mode, for practicing the invention. These examples are illustrative and should not be construed 
as limiting. 

Example 1 - Preparation of chromosomal DNA 

Porphyromonas gingivalis 381 obtained from a stock culture was grown on plates
containing Trypticase soy agar (BBL Microbiology Systems, Cockeysville, MD) supplemented with
sheep blood (5%), hemin (5 µg/ml), and menadione (5 µg/ml). The organism was also grown in 10
ml of Todd-Hewitt broth (BBL) supplemented with hemin (5 µg/ml), menadione (5 µg/ml) and
glucose (2 mg/ml). Cultures were incubated in an anaerobic chamber in a N2-H2-CO2 (85:10:5)
atmosphere at 37°C until the log phase of growth was obtained. The 10 ml broth culture was
transferred into 25 ml of the same medium and subsequently transferred to 500 ml of medium.
Incubation was at 37°C anaerobically until a late log phase culture was obtained. E. coli JM109
[recA1, endA1, gyrA96, thi, hsdR17, supE44, relA1, Δ(lac-proAB), (F', traD36, proAB, lacIqZΔM15)]
and the plasmid expression vector pUC9 have been described previously (Vieira, J., J.
Messing [1982] "The pUC Plasmids, an M13 mp 7-Derived System for Insertion Mutagenesis and
Sequencing with Synthetic Universal Primers," Gene 19:259-268). E. coli JM109 was cultured in
Luria-Bertani (LB) medium consisting of Bacto-tryptone (10 g/l), Bacto-yeast extract (5 g/l), and
NaCl (5 g/l). For solid media, Bacto-agar was added at a final concentration of 15 g/l. E. coli
JM109 transformants were selected and maintained on LB plates containing 50 µg of ampicillin/ml.
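
As an illustrative aid, the LB medium composition given above can be scaled to an arbitrary batch volume. The following is a minimal sketch using only the per-liter amounts stated in this Example; the function name lb_recipe and the dictionary layout are hypothetical conveniences, not part of the described method.

```python
# Per-liter amounts as stated for LB medium in this Example.
LB_G_PER_LITER = {
    "Bacto-tryptone (g)": 10.0,
    "Bacto-yeast extract (g)": 5.0,
    "NaCl (g)": 5.0,
}
AGAR_G_PER_LITER = 15.0        # added only for solid media
AMPICILLIN_UG_PER_ML = 50.0    # selection concentration for transformants


def lb_recipe(volume_liters: float, solid: bool = False, ampicillin: bool = False) -> dict:
    """Return the amount of each component needed for the requested volume."""
    if volume_liters <= 0:
        raise ValueError("volume must be positive")
    recipe = {name: grams * volume_liters for name, grams in LB_G_PER_LITER.items()}
    if solid:
        recipe["Bacto-agar (g)"] = AGAR_G_PER_LITER * volume_liters
    if ampicillin:
        volume_ml = volume_liters * 1000.0
        # ug/ml times ml gives ug; divide by 1000 to report mg.
        recipe["ampicillin (mg)"] = AMPICILLIN_UG_PER_ML * volume_ml / 1000.0
    return recipe


if __name__ == "__main__":
    # A 0.5 liter batch of selective LB agar (the batch size is illustrative).
    for component, amount in lb_recipe(0.5, solid=True, ampicillin=True).items():
        print(f"{component}: {amount}")
```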

Next, chromosomal DNA from P. gingivalis 381 was prepared as follows: One to three
liters of cells were pelleted by centrifugation and washed once with 1x SSC buffer (0.87% NaCl,
0.04% sodium citrate) containing 27% sucrose and 10 mM ethylenediaminetetraacetic acid (EDTA).
The cells were pelleted and resuspended in 1/50 of the original volume of the same buffer at 4°C.
Lysozyme (5 mg/ml) in SSC was added to 0.5 mg/ml; the mixture was mixed thoroughly and
incubated at 37°C for 10 minutes. Nine volumes of 1x SSC containing 27% sucrose, 10 mM EDTA
and 1.11% SDS (prewarmed to 39°C) were added and the cell suspension was incubated at 37°C
for 10 to 30 minutes until cell lysis was complete. In order to denature any contaminating proteins,
proteinase K was added to a final concentration of 1 mg/ml and the lysate was incubated at 37°C for
4 hours. DNA was extracted twice with phenol, twice with phenol-chloroform (1:1 by volume), and
four times with chloroform. Two volumes of absolute alcohol were added and the precipitated DNA
was spooled onto a glass rod. The purified DNA was rinsed with 70% ethanol and suspended in TE
buffer, pH 8.0 (10 mM Tris-HCl pH 8.0, 1 mM EDTA).
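
The lysozyme step above brings a 5 mg/ml stock to a final concentration of 0.5 mg/ml. The following minimal sketch shows the underlying mass-balance arithmetic, taking into account that adding stock also increases the total volume. The function name stock_volume_to_add and the 9 ml suspension volume in the usage example are hypothetical; the specification does not state the volumes actually used.

```python
def stock_volume_to_add(sample_volume_ml: float,
                        stock_mg_per_ml: float,
                        final_mg_per_ml: float) -> float:
    """Volume of stock to add to a sample to reach the final concentration.

    Mass balance: stock * v = final * (sample + v), hence
    v = sample * final / (stock - final).
    """
    if sample_volume_ml <= 0:
        raise ValueError("sample volume must be positive")
    if not 0 < final_mg_per_ml < stock_mg_per_ml:
        raise ValueError("final concentration must be between 0 and the stock concentration")
    return sample_volume_ml * final_mg_per_ml / (stock_mg_per_ml - final_mg_per_ml)


if __name__ == "__main__":
    # Hypothetical 9 ml cell suspension; lysozyme stock 5 mg/ml, target 0.5 mg/ml.
    print(stock_volume_to_add(9.0, 5.0, 0.5), "ml of lysozyme stock")
```

Under these assumptions the result corresponds to adding one volume of stock to nine volumes of suspension, i.e., a tenfold dilution of the stock.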

Alternatively, chromosomal DNA was isolated by a method of CTAB
(hexadecyltrimethyl ammonium bromide)/CsCl ultracentrifugation. Briefly, 0.4-0.5 g wet cells was
resuspended in 9.5 ml TE buffer (10 mM Tris/Cl, pH 8.0, 1 mM EDTA, pH 8.0), and then 0.5 ml
of 10% SDS, and 50 µl of 20 mg/ml proteinase K were added and incubated for 1 hour at 37°C.
Then 1.8 ml of 5 M NaCl and 1.5 ml CTAB/NaCl were added and incubated 20 minutes at 65°C.
The mixture was extracted with chloroform/isoamyl alcohol and precipitated with 0.6 volume
isopropanol. The DNA pellet was dissolved in 20 ml TE buffer and 20 g CsCl and 500 µl of 10 mg/ml
ethidium bromide were added and centrifuged 30 minutes at 12,000 rpm using a Beckman GA-20
rotor. The supernatant was then centrifuged in a Beckman VTi50 rotor for 18 hours at 45,000 rpm.
The DNA band was collected under a long wave UV lamp and ethidium bromide was removed by water
saturated butanol extraction and dialyzed against TE buffer thoroughly to remove CsCl.

Chromosomal DNA from the P. gingivalis strain W12 can be obtained by similar methods.

Example 2 - Isolation of Plasmid DNA and Construction of Genomic Libraries

Plasmid DNA was isolated by the method of Ish-Horowicz and Burke (Ish-Horowicz, D.,
J.F. Burke [1981] "Rapid and Efficient Cosmid Cloning," Nucleic Acids Res. 9:2989-2998) in
which cells were lysed with SDS-EDTA in the presence of NaOH. Potassium acetate, pH 4.8, was
added at 4°C and cell debris, protein, RNA, and chromosomal DNA were removed by centrifugation.
The plasmid was precipitated with two volumes of ethanol, washed with 70% ethanol, dried, and
resuspended in TE buffer at pH 7.5. The plasmid was separated from contaminating RNA and any
remaining chromosomal DNA by cesium chloride density centrifugation in the presence of ethidium
bromide. Ethidium bromide and cesium chloride were removed by butanol extraction and dialysis,
respectively. The dialyzed plasmid was then phenol-chloroform extracted, ethanol precipitated, and
resuspended in TE buffer.

Purified P. gingivalis DNA was then partially digested with Sau3A restriction endonuclease
to create fragments of 2-10 kilobases which were ligated to the dephosphorylated BamHI site of
vector pUC9 with T4 DNA ligase by standard methods (Maniatis, T., E.F. Fritsch, J. Sambrook
[1982] Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring
Harbor, NY; Sambrook, J., E.F. Fritsch, T. Maniatis [1989] Molecular Cloning: A Laboratory
Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY; and Wizard Mini-Prep Kit,
Promega Co., Madison, WI). Genomic fragments were also obtained by partial digestion of the
chromosomal DNA with HindIII restriction endonuclease and ligated to the dephosphorylated
HindIII site of pUC9. The recombinant plasmids were used to transform E. coli JM109. E. coli
JM109 was grown to an early log phase (OD350 = 0.2) in LB broth. Ten ml of the culture were
centrifuged at 5,000 rpm for 5 minutes at 4°C and resuspended in 2 ml of transformation buffer 1
(TFM 1, 10 mM Tris-HCl, pH 7.5, 0.15 M NaCl). The cells were then pelleted and resuspended in
2 ml of TFM 2 (50 mM CaCl2) and incubated on ice for 45 minutes. The cells were again pelleted
and gently resuspended in 3 ml of TFM 2, and dispensed into 0.2 ml aliquots. One-tenth ml of TFM
3 (10 mM Tris-HCl, pH 7.5, 50 mM CaCl2, 10 mM MgSO4) was added to each aliquot followed by
varying amounts of DNA. The cells were then allowed to incubate on ice for 45 minutes, and heat
shocked at 37°C for 2 minutes. LB broth (0.5 ml) was added and the cell suspension was incubated
at 37°C for 1 hour. Finally, the cells were plated on LB agar containing ampicillin (50 µg/ml) and
5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (X-GAL) (200 µg/ml) and incubated for 24 to
48 hours at 37°C. All transformants were stored at -70°C in LB broth with ampicillin (50 µg/ml)
and 20% glycerol.
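
The partial Sau3A digestion described above can be illustrated in silico. The sketch below relies on the generally known facts that Sau3A recognizes the sequence GATC and leaves ends compatible with a BamHI site; it enumerates every fragment bounded by two GATC sites whose length falls within a chosen window, which is what a partial digest can in principle yield. The function names, the linear-sequence simplification, and the toy sequence with its size window are hypothetical; for genomic DNA the window would be 2,000 to 10,000 bp as stated in this Example.

```python
import re

SAU3A_SITE = "GATC"  # Sau3A recognition sequence; cleavage occurs just 5' of the G


def cut_positions(seq: str) -> list:
    """0-based positions at which Sau3A would cut a linear sequence."""
    return [m.start() for m in re.finditer(SAU3A_SITE, seq.upper())]


def partial_digest_fragments(seq: str, min_len: int, max_len: int) -> list:
    """(start, end) pairs for fragments bounded by two Sau3A cuts.

    In a partial digest any pair of sites may end up as the two fragment
    ends, so all pairs are considered, not only neighboring sites. Only
    fragments with a Sau3A end on both sides are listed, because both ends
    must be compatible with the BamHI site of the vector.
    """
    cuts = cut_positions(seq)
    return [
        (start, end)
        for i, start in enumerate(cuts)
        for end in cuts[i + 1:]
        if min_len <= end - start <= max_len
    ]


if __name__ == "__main__":
    toy = "AAGATCTTTTGATCCCGATCAA"  # toy sequence, illustrative only
    print(cut_positions(toy))
    print(partial_digest_fragments(toy, 7, 14))
```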

Example 3 - Preparation of Antisera and Assay of Antibody Titer 

Late exponential phase cells of P. gingivalis strain 381 were pelleted, washed with 0.01 M
phosphate-buffered saline (PBS) pH 7.2, and resuspended in PBS and 0.01 sodium azide at 4°C for
at least 1 hour. The cells were again washed with PBS, resuspended to a concentration of 1 x 10^9
cells/ml and emulsified in an equal volume of Freund's incomplete adjuvant. The cell emulsion was
injected in 3 doses at two week intervals for 4 weeks subcutaneously in the back of adult New
Zealand rabbits. Each rabbit was given a booster dose 50 to 60 days later. Antisera were collected
from the marginal ear veins just prior to immunization and beginning one week after the booster
dose. All sera were stored at -20°C.

Rabbit anti-P. gingivalis antiserum was adsorbed 4 times with E. coli JM109 harboring
pUC9 plasmid [E. coli JM109 (pUC9)]. For each adsorption, E. coli cells from 1 liter of a
stationary phase culture were washed and mixed with 3 ml of serum at 4°C for 1 hour. The serum
was recovered by pelleting the cells at 5,000 x g for 20 minutes. For sonicate adsorption, E. coli cells
from 500 ml of stationary phase growth suspended in 5 ml PBS were disrupted by sonication and
mixed with E. coli cell-adsorbed serum for 1 hour at 4°C. The mixture was centrifuged at 100,000
x g for 1 hour and the resulting clear serum was stored at -20°C.

Sera were then tested for anti-P. gingivalis and anti-E. coli activities by an enzyme-linked
immunosorbent assay (ELISA). P. gingivalis cells suspended in carbonate-bicarbonate buffer, pH
9.6 (10^8 cells per well), were fixed to microtiter plates at 4°C overnight. After the wells were washed
with 0.5% "TWEEN-20" in PBS, 1% bovine serum albumin (BSA) in PBS was added to each well,
and the plates were incubated for 2 hours at room temperature in order to saturate the binding sites.
After washing the plates, serially diluted antiserum was added and plates were incubated for 1 hour
at room temperature followed by a second wash with 0.5% "TWEEN-20" in PBS. Peroxidase
conjugated goat anti-rabbit IgG, diluted 1:1000 in 1% BSA, was added and the plates were again
incubated at room temperature for 1 hour. After a final washing, a color-forming substrate solution
(o-phenylenediamine, 0.5 g/100 ml in 0.1 M citrate buffer, pH 4.5, and 1.8% hydrogen peroxide)
was added, and the plates were incubated for 30 minutes at room temperature. The absorbance at
492 nm was measured with a Titertek Multiscan reader. An absorbance of 0.05 or more over
background was considered positive. Background readings were obtained from the wells to which
all reagents except anti-P. gingivalis antiserum were added. Normal rabbit serum was also tested
against P. gingivalis antigen. To test the effectiveness of adsorption, the titers of treated sera were
assayed as described above except that E. coli JM109 (pUC9) whole cells were used as the antigen.

It was found that rabbit anti-P. gingivalis antiserum had an antibody titer of 1:64,000 to P.
gingivalis and 1:160 to E. coli (pUC9), whereas normal rabbit serum had an antibody titer of 1:10
to P. gingivalis and 1:80 to E. coli (pUC9). Adsorption of anti-P. gingivalis antiserum with E. coli
(pUC9) resulted in a slight reduction of antibody titer to P. gingivalis and reduced the anti-E. coli
titer to zero or 1:10.
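
The titer rule used in this example (an absorbance of 0.05 or more over background counts as positive, and the titer is the reciprocal of the highest positive dilution) can be sketched as follows. This is a minimal illustration only: the function name and the absorbance readings are hypothetical and are not taken from the experiments described here.

```python
def elisa_titer(readings, background, cutoff=0.05):
    """Return the reciprocal of the highest dilution scored positive, or 0 if none.

    readings: dict mapping reciprocal dilution (e.g. 160 for 1:160) to the
              absorbance measured at 492 nm for that well.
    A well is positive when its absorbance is at least `cutoff` over background.
    """
    positive = [dilution for dilution, a492 in readings.items()
                if a492 - background >= cutoff]
    return max(positive) if positive else 0


# Hypothetical readings for one serum (illustrative values only).
example_readings = {10: 0.92, 20: 0.71, 40: 0.48, 80: 0.27,
                    160: 0.13, 320: 0.08, 640: 0.06}
print(elisa_titer(example_readings, background=0.04))
```

With these made-up readings the 1:160 well is the last one at least 0.05 above background, so the sketch reports a titer of 160.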

Example 4 - Filter-Binding Enzyme Immunoassay 

Ampicillin-resistant transformants which formed white colonies in the presence of X-GAL
were spotted onto LB agar plates with ampicillin, grown overnight, and blotted onto nitrocellulose
filter disks. P. gingivalis and E. coli JM109 (pUC9) were also spotted onto each filter as a positive
and negative control, respectively. Duplicate prints of the colonies on nitrocellulose filters were
made and colonies on one of each duplicate print were lysed by a 15-minute exposure to chloroform
vapor. Filters were then air dried for 30 minutes and soaked for 2 hours in PBS containing 3% BSA.
After the filters were washed, adsorbed rabbit anti-P. gingivalis antiserum was added and the filters
were incubated in a solution of peroxidase conjugated goat anti-rabbit immunoglobulin for 1 hour.
After washing, the filters were developed in a color-forming substrate solution consisting of 0.06%
4-chloro-1-naphthol and 3% hydrogen peroxide in a 1:4 solution of methanol-TBS (50 mM Tris
hydrochloride, 200 mM NaCl, pH 7.4). Clones which developed a blue color were picked and
rescreened by the same procedure.

A total of 1,700 colonies of transformants resulting from HindIII-restricted chromosomal
DNA were tested for the expression of P. gingivalis antigens. Seven clones gave positive signals.

Example 5 - Restriction and Southern Blot Analysis of Recombinant Plasmids

To further confirm the positive results of the filter-binding enzyme immunoassay, plasmid 
DNA was isolated from each positive clone. Electrophoresis of these unrestricted plasmids showed 
that each clone contained only one recombinant plasmid. 

Southern blot analysis was also performed to confirm that the DNA inserts were derived
from the P. gingivalis DNA. Plasmids were isolated from all the clones that were positive in the
filter-binding enzyme immunoassay. Restriction endonuclease digestions were performed under
conditions described by the manufacturer to produce complete digestion. Agarose gel
electrophoresis was performed as described by Maniatis et al. (1982, supra).

Recombinant plasmid and pUC9 vector DNAs were digested to completion with the
appropriate restriction enzymes and run on a 1.2% agarose gel. P. gingivalis DNA partially digested
with Sau3A and HindIII-digested Eikenella corrodens clone 18 DNA were also loaded in the gel.
The DNA was transferred to "BIODYNE" nylon membrane by Southern transfer (Southern, E.M.
[1975] "Detection of Specific Sequences Among DNA Fragments Separated by Gel
Electrophoresis," J. Mol. Biol. 98:503-517). P. gingivalis DNA partially digested with HindIII was
nick translated with (α-32P dCTP) (400 Ci/mmol, Amersham Corp., Arlington Heights, Ill.) as
described by Maniatis et al. (1982, supra). The membrane-bound DNA was hybridized to the nick-
translated probe at 42°C in 50% formamide for 16 hours by the method recommended by the
manufacturer (Pall Ultrafine Filtration Corp., Glen Cove, NY), which was adapted from Wahl et al.
(Wahl, G.M., M. Stern, G.R. Stark [1979] "Efficient Transfer of Large DNA Fragments from
Agarose Gels to Diazobenzyloxymethyl-Paper and Rapid Hybridization by Using Dextran Sulfate,"
Proc. Natl. Acad. Sci. USA 76:3683-3687). The membrane was washed four times at room temperature
for 5 minutes each in wash buffer (2 x SSC and 0.1% SDS) and twice at 50°C for 15
minutes each in 0.1 x SSC, 0.1% SDS. An autoradiogram was obtained with Kodak XAR-5 film
(Eastman Kodak Co., Rochester, NY) and Cronex Quanta II intensifying screen (DuPont Co.,
Wilmington, DE).
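
As a worked illustration of the two wash buffers named above (2 x SSC with 0.1% SDS, and 0.1 x SSC with 0.1% SDS), the sketch below computes how much of each stock would be combined for a given final volume. The 20 x SSC and 10% SDS stock strengths, the 500 ml batch size, and the function name are assumptions made for the example; the text does not state how the buffers were prepared.

```python
def wash_buffer_recipe(final_volume_ml, ssc_x, sds_percent,
                       ssc_stock_x=20.0, sds_stock_percent=10.0):
    """Volumes of stock solutions and water for an SSC/SDS wash buffer.

    Uses C1 * V1 = C2 * V2 for each component. The default stock strengths
    (20 x SSC, 10% SDS) are assumed values, not taken from the source text.
    """
    ssc_ml = final_volume_ml * ssc_x / ssc_stock_x
    sds_ml = final_volume_ml * sds_percent / sds_stock_percent
    water_ml = final_volume_ml - ssc_ml - sds_ml
    return {"ssc_stock_ml": ssc_ml, "sds_stock_ml": sds_ml, "water_ml": water_ml}


# Room-temperature wash buffer and 50 degree C wash buffer, 500 ml each (assumed volume).
low_stringency = wash_buffer_recipe(500, ssc_x=2.0, sds_percent=0.1)
high_stringency = wash_buffer_recipe(500, ssc_x=0.1, sds_percent=0.1)
print(low_stringency)
print(high_stringency)
```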

Clones 1, 2, 4, 5, 6, 7, and 8 were generated from HindIII-restricted chromosomal DNA. After
digestion with HindIII, only clones 5, 6, 7, and 8 revealed fragments of the linear pUC9 vector and
fragments of P. gingivalis DNA inserts. Plasmid DNAs of these clones were restricted with various
enzymes and analyzed by gel electrophoresis. The estimated sizes of the inserts of clones 5, 6, 7, and 8
are 5.5, 5.5, 4.8, and 3.5 kb, respectively (Table 1). Clones 5 and 6 were thus found to contain
plasmids of the same size and identical restriction fragments.

Clone 3, which was constructed by ligation of Sau3A partially digested P. gingivalis DNA
with BamHI-cut pUC9, was restricted with SmaI and SalI. Restriction analysis revealed a fragment
of linear 9 bp-deleted pUC9 and 2 fragments of insert. Restriction analysis with different enzymes
showed that the size of the insert of clone 3 was approximately 1.1 kb.

Although clones 1, 2, and 4 were generated from HindIII-restricted DNA, they did not yield
fragments of linear pUC9 after HindIII digestion. These cloned DNAs were then restricted with
PvuII, which generates a 307 bp fragment containing the polylinker-cloning sites from pUC9.
Clones 1, 2, and 4 revealed fragments of linear 307 bp-deleted pUC9 and inserts associated with the
deleted fragment. These cloned DNAs were digested with various restriction enzymes and analyzed
by agarose gel electrophoresis. The sizes of the inserts of clones 1, 2, and 4 were found to be 3.2, 3.2,
and 3.3 kb, respectively (Table 1). Clones 1 and 2 also contained plasmids of the same size and
identical restriction fragments.



Table 1. Characterization of E. coli transformants which express P. gingivalis antigens

               Colonies reacted with antiserum    Size of B. gingivalis
Clone No.      unlysed          lysed             DNA cloned (kb)

1 and 2        -                +                 3.2
3              +                +                 1.1
4              +                +                 3.3
5 and 6        +                -                 5.5
7              -                +                 4.8
8              *                +                 3.5

+ = Positive reaction
- = Negative, not reactive
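
The insert sizes in Table 1 were estimated from agarose gels. The text does not say how the estimates were made; one common approach, sketched here purely as an assumption, is to interpolate the logarithm of fragment size against migration distance between size standards run on the same gel. The marker distances below are hypothetical.

```python
import math


def estimate_size_bp(distance_mm, standards):
    """Estimate a fragment size from its migration distance.

    standards: list of (size_bp, distance_mm) pairs for marker fragments.
    log10(size) is interpolated linearly between the two markers that
    bracket the band; bands outside the marker range raise ValueError.
    """
    points = sorted(standards, key=lambda s: s[1])  # order by migration distance
    for (size_a, dist_a), (size_b, dist_b) in zip(points, points[1:]):
        if dist_a <= distance_mm <= dist_b:
            fraction = (distance_mm - dist_a) / (dist_b - dist_a)
            log_size = math.log10(size_a) + fraction * (
                math.log10(size_b) - math.log10(size_a))
            return 10 ** log_size
    raise ValueError("band lies outside the range of the size standards")


# Hypothetical marker sizes (bp) and migration distances (mm).
markers = [(23130, 10.0), (9416, 16.0), (6557, 20.0), (4361, 26.0), (2322, 36.0)]
print(round(estimate_size_bp(23.0, markers)))
```

A band that runs halfway between two markers is assigned the geometric mean of their sizes, which is why the interpolation is done on the logarithm rather than on the size itself.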



Example 6 - Assay of the Titer of Anti-P. gingivalis Antiserum to E. coli Transformants Which
Express P. gingivalis Antigens

Cultures of each representative clone were prepared by 100-fold dilution of overnight
cultures and grown for 2 hours at 37°C. Isopropyl-β-D-thiogalactopyranoside (IPTG) was added
to specific cultures at a final concentration of 1 mM and the cells were pelleted by centrifugation 4
hours later. The cells were washed, resuspended in 1/10 volume of PBS, and the optical density of
each suspension was determined at 550 nm. Cell lysate antigen was prepared by breaking the cells
with a sonicator. The protein concentration of each lysate was determined by the Bio-Rad protein
assay (Bio-Rad Laboratories, Richmond, CA). Determination of the titer of anti-P. gingivalis 381
against these antigens was performed with the ELISA as described above (10^8 cells or 1 µg protein
per well). Normal rabbit serum exhaustively adsorbed with E. coli JM109 (pUC9) was also tested
in the same manner.

Anti-P. gingivalis antiserum was able to detect antigen expression in all positive clones
except clone 8 in an enzyme-linked immunosorbent assay (ELISA). The antiserum reacted with both
whole cell and cell lysate antigens. Isopropyl-β-D-thiogalactopyranoside (IPTG) was not necessary
to induce antigen expression. However, in the presence of IPTG, clones 2 and 3 showed higher
antigen expression, especially when the cell lysate preparations were tested. These results are shown
in Table 2.

Table 2. Titer of anti-P. gingivalis antiserum against E. coli
transformants which express P. gingivalis antigens

                          Antibody titers(a) against test antigens(b)
                          Whole cell                     Cell lysate
Organism                  IPTG-            IPTG+         IPTG-        IPTG+

Clone 1                   320              NT(c)         320-640      NT
Clone 2                   320              640           320-640      1280-2560
Clone 3                   20               160           40-160       1280
Clone 4                   20-100           20-40         20-40        20-40
Clone 5                   40-80            40-80         40-80        40-80
Clone 6                   40               NT            40           NT
Clone 7                   40               40            40           40
Clone 8                   0                0             0            NT
E. coli JM109 (pUC9)      0-10             0-10          0-10         0-10
P. gingivalis             40,960-64,000    NT            NT           NT
Control NRS(d)

(a) Number designates the reciprocal dilution of the sera which gave an OD492 reading of 0.05 or more over the background.
Antiserum was exhaustively adsorbed with E. coli JM109 (pUC9).
(b) Antigens were prepared from cultures grown without IPTG (IPTG-) or in the presence of IPTG (IPTG+).
(c) Not tested.
(d) Normal rabbit serum exhaustively adsorbed with E. coli JM109 (pUC9) did not react to test antigens.
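
To make the IPTG effect in Table 2 easier to compare across clones, the following sketch converts a pair of titer entries into a range of fold changes. The parsing helper and the choice to report the most conservative and most generous ratios are assumptions of this illustration; the titer strings are the cell lysate entries as read from Table 2.

```python
def parse_titer(text):
    """Parse an entry such as '320', '1280-2560' or 'NT' into (low, high) or None."""
    text = text.replace(",", "").strip()
    if text.upper() == "NT":
        return None  # not tested
    parts = [int(p) for p in text.split("-")]
    return (parts[0], parts[-1])


def fold_change(without_iptg, with_iptg):
    """Return (min_fold, max_fold) of the IPTG+ titer over the IPTG- titer.

    Returns None when either entry is 'NT' or the IPTG- titer includes zero.
    """
    minus, plus = parse_titer(without_iptg), parse_titer(with_iptg)
    if minus is None or plus is None or minus[0] == 0:
        return None
    return (plus[0] / minus[1], plus[1] / minus[0])


# Cell lysate titers (IPTG-, IPTG+) as read from Table 2.
lysate_titers = {
    "Clone 2": ("320-640", "1280-2560"),
    "Clone 3": ("40-160", "1280"),
    "Clone 4": ("20-40", "20-40"),
    "Clone 7": ("40", "40"),
}
for clone, (minus, plus) in lysate_titers.items():
    print(clone, fold_change(minus, plus))
```

Under this reading, only clones 2 and 3 have a fold-change range that lies entirely above 1, which agrees with the statement that these two clones showed higher antigen expression in the presence of IPTG.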



Example 7 - Sodium Dodecyl Sulfate - Polyacrylamide Gel Electrophoresis (SDS-PAGE)

Five stable representative clones were analyzed for antigen expression by SDS-PAGE. Each
of the representative antigen-producing clones was grown to mid-log phase in 3.0 ml of LB broth
with 50 µg of ampicillin/ml. The cells were pelleted, washed with PBS, resuspended in 0.3 ml of
sample buffer (62.5 mM Tris-hydrochloride, 5% 2-mercaptoethanol, 2% SDS, 10% glycerol,
0.002% bromphenol blue, pH 6.8), and boiled for three minutes. The P. gingivalis cell lysate was
mixed with an equal volume of sample buffer and treated in the same manner.

SDS-PAGE was performed using a 12% polyacrylamide gel in a vertical slab gel
electrophoresis tank (Hoefer Scientific Instruments, San Francisco, CA) as described by Laemmli
(Laemmli, U.K. [1970] "Cleavage of Structural Proteins During the Assembly of the Head of
Bacteriophage T4," Nature (London) 227:680-685). A whole cell preparation from clone 2 was
separated in a 5% SDS polyacrylamide gel and the expressed protein was initially estimated to have
a molecular weight of more than 125 kD and later determined to be greater than 144 kD.

Example 8 - Assay for Removal of SHA Adherence Inhibition by Anti-P. gingivalis Antiserum

The expression of components detected by in vitro methods was subjected to further
examination. The antigen-expressing clones described in the previous examples were tested for the
expression of adhesins for saliva-treated hydroxyapatite (SHA adhesin). Anti-P. gingivalis 381
antiserum, which inhibits the adherence of P. gingivalis 381 to SHA, was adsorbed with each antigen-
expressing clone until the titer of the antiserum to each clone was reduced to zero. Each adsorbed
antiserum was tested for inhibition of P. gingivalis adherence to SHA.

Porphyromonas gingivalis 381 was cultured in Todd-Hewitt broth. E. coli transformants
were cultured in LB medium containing 50 µg of ampicillin/ml by preparing 100-fold dilutions of
overnight cultures followed by incubation for 2 hours at 37°C. IPTG, when used, was added to the
cultures at a final concentration of 1 mM, and the cultures were incubated for an additional 4
hours.

An assay for the removal of SHA adherence inhibition using anti-P. gingivalis antiserum
was used to test for SHA adherence. To do this, aliquots of anti-P. gingivalis antiserum were
adsorbed with each antigen-expressing clone as well as E. coli JM109 (pUC9). The titer of each
adsorbed antiserum was tested against each clone and P. gingivalis whole cell antigen by ELISA as
described above.

Whole paraffin-stimulated human saliva was collected and heated at 56°C for 30 minutes
to inactivate degradative enzymes. Extraneous debris and cells were removed by centrifugation at
12,000 rpm for 10 minutes and sodium azide was added to a final concentration of 0.04%.

Hydroxyapatite (HA) beads (BDH Biochemical, Ltd., Poole, England) were treated as
previously described (Clark, W.B., L.L. Bammann, R.J. Gibbons [1978] "Comparative Estimates
of Bacterial Affinities and Adsorption Sites on Hydroxyapatite Surfaces," Infect. Immun. 19:846-
853). Briefly, 10 mg of beads were washed and hydrated in distilled water in 250 µl plastic
microfuge tubes followed by equilibration overnight with adsorption buffer (0.05 M KCl, 1 mM
pH 7.3, 1 mM CaCl2 and 0.1 mM MgCl2). The beads were incubated with 200 µl of saliva
for 24 hours at 4°C and then washed with sterile adsorption buffer to remove nonadsorbing material.
Control tubes without HA were treated identically.

P. gingivalis 381 cells were labeled by growth to late log phase in medium supplemented
with (3H) thymidine (10 mCi/ml). The cells were pelleted, washed twice in adsorption buffer, and
dispersed with three 10-second pulses (medium power) with a microultrasonic cell disrupter.

The dispersed cells were mixed with each antiserum (1:100 dilution) and normal rabbit
serum to a final concentration of 4 x 10^6 cells/ml. The cell-antiserum suspensions (200 µl) were then
added to the SHA beads in microfuge tubes and the tubes were rotated in an anaerobic chamber for
1 hour. Labeled cells alone (no antisera) were treated in the same manner to determine the number
of cells adhering to the SHA surface. A control tube containing cells but no SHA was tested to
quantitate the amount of cells bound to the tubes rather than to the SHA. One hundred microliters
of adsorption buffer containing unadhered cells was removed and placed in minivials containing 3
ml of aqueous scintillation cocktail (Amersham/Searle, Arlington Heights, IL), and counted with a
scintillation counter (Model 455 Packard Tri-Carb). Determination of the number of cells adhering
to the SHA was done by subtracting the number of cells (no. of counts) in solution from the total
number of cells (no. of counts) which did not adhere to the tube.

The results in Table 3 summarize the SHA inhibition data and indicate that the antiserum 
adsorbed with each antigen-expressing clone still inhibited the adherence of P. gingivalis. 



Table 3. Inhibition of adherence to SHA by adsorbed anti-P. gingivalis

Inhibitor and dilution                         % adherence(a)    % inhibition(b)

None                                           83.85
Normal rabbit serum             1:100          80.08             0.05
Antiserum unadsorbed            1:100          22.70             72.15
Antiserum adsorbed with:
  E. coli JM109 (pUC9)          1:100          21.57             73.07
  Clone 2                       1:100          10.73             86.59
  Clone 3                       1:100          22.60             71.78
  Clone 4                       1:100          16.24             79.71
  Clone 5                       1:100          27.37             65.82

(a) Percent adherence was calculated from the following formula: % adherence = [(cpm from tube without SHA - cpm from
tube with SHA)/(cpm from tube without SHA)] x 100.
(b) Percent inhibition was calculated from the following formula: % inhibition = [1 - (% adherence in the presence of antibody
/ % adherence in the absence of antibody)] x 100.
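
The two footnote formulas can be written directly as functions. The sketch below is only an illustration of the arithmetic; the function names and the counts per minute (cpm) are hypothetical and are not the measurements behind Table 3.

```python
def percent_adherence(cpm_without_sha, cpm_with_sha):
    """% adherence = [(cpm without SHA - cpm with SHA) / cpm without SHA] x 100."""
    return (cpm_without_sha - cpm_with_sha) / cpm_without_sha * 100.0


def percent_inhibition(adherence_with_antibody, adherence_without_antibody):
    """% inhibition = [1 - (adherence with antibody / adherence without)] x 100."""
    return (1.0 - adherence_with_antibody / adherence_without_antibody) * 100.0


# Hypothetical unadhered-cell counts (cpm) remaining in solution.
cpm_no_sha = 10000.0            # tube without SHA beads
cpm_sha_no_antibody = 2000.0    # SHA beads, no antibody
cpm_sha_antiserum = 8000.0      # SHA beads, cells premixed with antiserum

adherence_control = percent_adherence(cpm_no_sha, cpm_sha_no_antibody)
adherence_antiserum = percent_adherence(cpm_no_sha, cpm_sha_antiserum)
inhibition = percent_inhibition(adherence_antiserum, adherence_control)
print(round(adherence_control, 2), round(adherence_antiserum, 2), round(inhibition, 2))
```

Because the counts measure cells left in solution, a higher count in the presence of antiserum means fewer cells bound to the beads and therefore a higher percent inhibition.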

Example 9 - Direct Hemagglutination Assay 
The rationale for identifying the clones which express hemagglutinin was analogous to that
described for the SHA adhesin. The anti-P. gingivalis antiserum adsorbed with each antigen-
expressing clone and E. coli JM109 (pUC9), as described for the SHA assay, was tested for
removal of hemagglutination inhibition activity of anti-P. gingivalis antiserum. Since it is necessary
to determine the minimum number of P. gingivalis cells which produce hemagglutinin before
performing the hemagglutination inhibition assay, a direct hemagglutination assay of antigen-
expressing clones together with P. gingivalis was first performed.

A direct hemagglutination assay was used to test for adhesion to erythrocytes. The
hemagglutination assays were carried out in V-bottom microtiter plates (Dynatech Laboratories, Inc.,
Alexandria, VA). Erythrocytes (sheep or human group O) were washed three times with PBS (0.02
M phosphate buffered saline), pH 7.2, and resuspended to a final concentration of 0.2% (v/v). Cells
of P. gingivalis and antigen-expressing clones were washed twice in PBS and resuspended to an
optical density of 0.5 and 2.0, respectively, at 660 nm. The cell suspensions were diluted in a
twofold series with PBS and 0.05 ml of the suspensions were added to the wells. E. coli JM109
(pUC9), which was prepared in the same manner as the antigen-expressing clones, was included as
a control. An equal volume (0.05 ml) of washed erythrocytes was added and mixed with the
bacterial cells. The plates were stored for 16 hours at 4°C and then examined for evidence of
hemagglutination as follows. Agglutinated erythrocytes will settle as clumps which will be dispersed
throughout the bottom of the wells, resulting in a pinkish-red coating of each well. In the absence
of hemagglutination, the erythrocytes will settle on the bottom of the well as a central, smooth, bright
red round disk. The titer was expressed as the reciprocal of the highest dilution showing positive
agglutination.
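
Reading a hemagglutination titer from a twofold dilution series can be sketched as below. The convention that the first well holds the undiluted suspension (reciprocal dilution 1) is an assumption of this illustration, as are the function names and the example well patterns.

```python
def twofold_dilutions(n_wells):
    """Reciprocal dilutions for a twofold series: 1 (undiluted), 2, 4, 8, ..."""
    return [2 ** i for i in range(n_wells)]


def hemagglutination_titer(well_results):
    """Reciprocal of the highest dilution showing agglutination, or 0 if none.

    well_results: list of booleans in dilution order; True means the well
    showed the dispersed, pinkish-red coating scored as agglutination.
    """
    titer = 0
    for reciprocal, agglutinated in zip(twofold_dilutions(len(well_results)),
                                        well_results):
        if agglutinated:
            titer = reciprocal
    return titer


# Hypothetical plate rows: agglutination in the first two wells, or the first only.
print(hemagglutination_titer([True, True, False, False]))
print(hemagglutination_titer([True, False, False, False]))
```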

The hemagglutination inhibition assay was also carried out in V-bottom microtiter plates.
P. gingivalis cell suspensions in PBS were adjusted to an optical density of 0.5 at 660 nm. Each
antiserum examined for hemagglutination inhibition activity was diluted twofold in a series of wells.
Fifty microliters of the bacterial suspension containing twice the minimum number of cells which
produced hemagglutination was then added to each well. After incubation with gentle shaking at
room temperature for 1 hour, 0.05 ml of the washed erythrocytes were added to each well and mixed.
The plates were left for 16 hours at 4°C and read for hemagglutination as described above for the
hemagglutination assay. The titer was expressed as the reciprocal of the highest dilution showing
hemagglutination inhibition.

E. coli transformants which were able to agglutinate erythrocytes were grown in LB broth
containing ampicillin as described above. Two rabbits were injected with each clone as previously
described. Sera were exhaustively adsorbed with E. coli JM109 (pUC9) and tested for anti-P.
gingivalis activity by ELISA.

Anti-clone 2 antiserum diluted 1:10 was separately adsorbed with P. gingivalis, E. coli
JM109 (pUC9), and clones 2, 5, and 7. Washed stationary phase cells of each bacterial culture were
prepared as described above. For each adsorption, 10^7, 10^8, 10^9 and 10^10 bacterial cells were mixed
with 200 µl of serum and the suspensions were stored at 4°C overnight. The sera were recovered
by centrifugation at 12,000 x g for 10 minutes. Each adsorbed antiserum was assayed by ELISA to
determine the titer to P. gingivalis.

The direct hemagglutination assay of these clones demonstrated that clones 2, 5, and 7 did
agglutinate sheep erythrocytes, whereas E. coli JM109 (pUC9) did not. The hemagglutination titer
of clone 2 was 2, whereas clones 5 and 7 agglutinated erythrocytes as undiluted suspensions.
In addition, clone 5 was found to auto-agglutinate when resuspended in PBS, pH 7.2.



Example 10 - DNA Restriction Mapping and Characterization Procedures

Restriction endonuclease digestions of the recombinant plasmids from clones 2, 5, and 7
were performed according to manufacturer's directions. Clone 5 DNA was digested with HindIII and
two fragments of P. gingivalis inserts were isolated from agarose gels by the method of Zhu et al.
(Zhu, J., W. Kempenaers, D. Van der Straeten, R. Contreras, W. Fiers [1985] "A Method for Fast and
Pure DNA Elution from Agarose Gels by Centrifugal Filtration," Biotech. 3:1014-1016) employing
centrifugal filtration of DNA fragments through a Millipore membrane inside a conical tip. The
DNA preparations were extracted with phenol-chloroform, precipitated with ethanol and resuspended
in TE, pH 8.0. Each DNA fragment was ligated to HindIII-digested pUC9 and the resulting
recombinant plasmids were transformed into competent E. coli JM109 cells as described previously.
Recombinant plasmids from these transformants were isolated by rapid plasmid DNA isolation
(Silhavy, T.J., M.L. Berman, L.W. Enquist [1984] Experiments with Gene Fusions, Cold Spring
Harbor Laboratory, Cold Spring Harbor, NY), digested with appropriate restriction endonucleases,
and analyzed by agarose gel electrophoresis.

The recombinant plasmids of clones 2, 5, and 7 were restricted with several restriction
endonucleases and analyzed in 1.2% agarose gels. A schematic diagram of restriction enzyme
recognition sites of these three clones is detailed in Figure 1. These data show that the clone 2 insert
is different from that of clones 5 and 7, whereas clones 5 and 7 have one insert fragment in common.
The restriction map of clone 2 revealed that the HindIII site of the DNA insert at the amino terminal
end of the β-galactosidase gene was still intact, but a deletion occurred at the other end of the insert
and included most of the linker. The linker region with recognition sites of PstI, SalI, BamHI and
SmaI was deleted but the EcoRI site was still intact as well as other sites upstream such as PvuII and
NarI.

To further confirm the results of the restriction maps, 32P-labeled clone 7 recombinant DNA
was used as a probe for hybridization of restricted recombinant plasmids by Southern blot analysis.
Clone 2 DNA restricted with HindIII, EcoRI, and SmaI resulted in DNA fragments of pUC9 and four
pieces of insert of approximately 1,400, 1,300, 420, and 150 bp. Clone 5 DNA restricted with
HindIII resulted in fragments of pUC9 and two pieces of insert of approximately 4,800 and 760 bp.
Fragment bands of pUC9 and inserts of approximately 2,800, 2,000, and 760 bp were generated
from digestion of clone 5 DNA with HindIII and BamHI. Clone 7 DNA restricted with HindIII alone
and with HindIII together with BamHI resulted in pUC9 and an insert of 4,800 bp, and in pUC9 and
inserts of 2,800 and 2,000 bp, respectively.
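
A simple consistency check on these digests is that the insert fragments from different digests of the same clone should add up to about the same total. The sketch below sums the fragment sizes quoted above; the dictionary layout, function names, and the 5% tolerance are assumptions chosen for illustration.

```python
# Insert fragment sizes (bp) quoted in the text for each digest.
digests = {
    "clone 2 (HindIII + EcoRI + SmaI)": [1400, 1300, 420, 150],
    "clone 5 (HindIII)": [4800, 760],
    "clone 5 (HindIII + BamHI)": [2800, 2000, 760],
    "clone 7 (HindIII)": [4800],
    "clone 7 (HindIII + BamHI)": [2800, 2000],
}


def insert_size_bp(fragments):
    """Total insert size implied by one digest."""
    return sum(fragments)


def consistent(size_a, size_b, tolerance=0.05):
    """True when two size estimates agree within a relative tolerance (assumed 5%)."""
    return abs(size_a - size_b) <= tolerance * max(size_a, size_b)


totals = {name: insert_size_bp(frags) for name, frags in digests.items()}
for name, total in totals.items():
    print(name, total)
```

The two clone 5 digests both total 5,560 bp and the two clone 7 digests both total 4,800 bp, in line with the approximately 5.5 kb and 4.8 kb inserts listed in Table 1; the clone 2 fragments total 3,270 bp against the listed 3.2 kb.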

Hybridization of these transferred restricted DNAs demonstrated that the clone 7 probe 
hybridized to pUC9 and the common insert of clones 5 and 7 but not to the insert of clone 2. 

Clone 5 was found to agglutinate erythrocytes and autoagglutinate, while clone 7 was only
able to agglutinate erythrocytes. Clone 5 has an insert of 760 bp in addition to the common insert
of 4,800 bp of clone 7. These data suggested that the 760 bp insert might encode the
autoagglutinating activity and the 4,800 bp fragment the hemagglutinating activity of clone 5.
The recombinant plasmid of clone 5 was thus digested with HindIII to generate pUC9 and inserts
of 4,800 and 760 bp. Each insert band was isolated from these transformants and digested with
restriction endonucleases. Subclones with different orientations of the insert were obtained.
Subclones of 760 bp inserts were designated clones 5.1 and 5.2 and the subclones of 4,800 bp inserts,
clones 5.3 and 5.4. Recombinant plasmids of clones 5.1 and 5.2 digested with HindIII did result in
pUC9 and the 760 bp inserts, and different patterns of restricted DNAs were seen when digested
with SalI. HindIII-restricted recombinant plasmids of clones 5.3 and 5.4 revealed pUC9 and inserts
of 4,800 bp, while EcoRI-restricted recombinant plasmids showed different patterns. Both clones
5.1 and 5.2 were able to autoagglutinate when resuspended in PBS, pH 7.2, but could not agglutinate
erythrocytes. Clones 5.3 and 5.4 were both able to agglutinate erythrocytes but did not
autoagglutinate.

Example 11 - Identification and Characterization of Gene Products by Sodium Dodecyl Sulfate-
Polyacrylamide Gel Electrophoresis (SDS-PAGE), Western Blot, Minicell Analysis, and
Immunoaffinity Chromatography

P. gingivalis cell lysate and cells of E. coli transformants were prepared and analyzed by
SDS-PAGE as described above and Western blot as described by Burnette (Burnette, W.N. [1981]
"Western Blotting: Electrophoretic transfer of proteins from sodium dodecyl sulfate-polyacrylamide
gels to radiographic detection with antibody and radioiodinated protein A," Anal. Biochem. 112:195-
203). Antisera to clones 2, 5, and 7 exhaustively adsorbed with E. coli JM109 (pUC9) were used
as probes in the Western blot. Control antisera included anti-clone 2 antiserum also adsorbed with
P. gingivalis at the ratio of 10^10 cells per 100 µl of antiserum, and antiserum to E. coli JM109
harboring pUC9 with Eikenella corrodens DNA insert.

Upon Western blot analysis of clone 2, a protein antigen of approximately 125 kD and a
smear of lower molecular weight were detected using E. coli adsorbed anti-P. gingivalis antiserum,
but no corresponding antigens expressed in clones 5 and 7 were detected by Western blot analysis.
Clones 5 and 7 did, however, express a protein detected as a major band of approximate M.W. 49-50
kD by Western blot analysis and revealed an additional minor band of 27 kD upon minicell
autoradiography.

For the identification of clones 5 and 7 gene products, the minicell procedure was used as
described by Clark-Curtiss et al. and Dougan et al. (Clark-Curtiss, J.E., R. Curtiss III [1983]
"Analysis of Recombinant DNA Using Escherichia coli Minicells," Methods Enzymol. 101:347-
362; Dougan, G., M. Kehoe [1984] "The minicell system as a method for studying expression from
plasmid DNA," Methods Microbiol. 17:233-258). Recombinant plasmids were transformed into
E. coli as previously described. Transformants were selected on LB plates containing 50 µg/ml
ampicillin and 10 mM isopropyl-β-D-thiogalactopyranoside (IPTG). Colonies were streaked for
isolation and grown overnight at 37°C in BSG (phosphate-buffered saline + 0.01% gelatin)
containing 50 µg/ml ampicillin. Minicells were then isolated by sequential low speed centrifugation,
high speed centrifugation of the low speed supernatant fluid, and centrifugation through a 5-30%
(w/v) sucrose gradient. The sucrose gradient centrifugation was repeated at least once. The
minicells were collected and diluted twofold in BSG, pelleted by centrifugation at 10,000 rpm for
10 minutes, and the resulting pellet was resuspended in minicell labeling medium containing no
methionine. After incubation of the minicell suspension for 10 minutes at 37°C, 10 µCi of 35S-
methionine were added. Following a 15 minute incubation, the cells were chilled for 10 minutes on
ice and pelleted by a two minute centrifugation in a microfuge. The cell pellets were then processed
for SDS-PAGE. Autoradiography was performed on 35S-methionine labeled minicell preparations
which were electrophoresed on a 12% SDS-PAGE.

In order to determine the native P. gingivalis antigens which clone 2 expressed, antisera
against clone 2 were made in rabbits for use as a probe in Western blot analysis. Pooled anti-clone
2 antiserum had a titer of 1:16,000 against P. gingivalis whole cell antigen. This antiserum was
adsorbed exhaustively with E. coli JM109 (pUC9) until the anti-E. coli titer was reduced from
1:50,000 to 1:10 in the E. coli whole cell ELISA. The adsorbed antiserum, diluted to 1:200, was
used as a probe to detect antigens separated in a 12.5% SDS polyacrylamide gel and transferred to
a nitrocellulose sheet. This antiserum reacted with two major bands of approximately MWs 43,000
and 38,000 and two bands of MWs 32,000 and 30,000 in P. gingivalis cell lysate antigen and the
125 kD protein band of expressed antigen in clone 2. Normal rabbit serum reacted to a common
40,000 molecular weight band of all the clones and E. coli JM109 (pUC9).

In order to prove that the P. gingivalis reactive polypeptides are exclusively P. gingivalis
proteins, the native P. gingivalis antigens were reacted to E. coli adsorbed anti-clone 2 antiserum.
P. gingivalis cell lysate antigen and clone 2 whole cell antigen were again separated in 12.5% SDS-
polyacrylamide gel. Upon transfer to a nitrocellulose sheet, each was reacted with (1) E. coli
adsorbed anti-clone 2 antiserum, (2) P. gingivalis adsorbed anti-clone 2 antiserum, and (3) antisera
to E. coli JM109 harboring pUC9 with an Eikenella corrodens DNA insert. E. coli adsorbed anti-
clone 2 reacted to P. gingivalis cell lysate at two major bands of MWs 43,000 and 38,000, two
bands of MWs 32,000 and 30,000 and three faint bands of higher molecular weight of
approximately 110,000, 90,000 and 75,000 daltons. This adsorbed antiserum also reacted to a band
of expressed antigen having a molecular weight greater than 125 kD in clone 2.

To define the native P. gingivalis antigens which clones 5 and 7 expressed, antisera against
clones 5 and 7 were also made in rabbits and had titers of 1:800 and 1:1,600 to P. gingivalis
antigens. These antisera, exhaustively adsorbed with E. coli, were used to identify the reactive native
P. gingivalis antigens. Antisera against clones 5 and 7 at dilutions of 1:5 and 1:10 were found
to react with two bands of approximately 43,000 and 38,000 daltons in P. gingivalis cell lysate
antigen preparation but did not react to the expressed clone 2 antigen. These antisera also reacted
to a common band of approximately 36,000 daltons of E. coli antigen in each clone and E. coli
JM109 (pUC9). Normal rabbit serum did not react to any P. gingivalis antigens.

Immunoaffinity chromatography was used to identify and purify the native P. gingivalis
gene product and to verify that inserts of clones 5 and 7 contained the entire gene. Immune rabbit
IgG was purified via DEAE cellulose. Following the precipitation of IgG by the addition of
saturated ammonium sulfate to the sera, the IgG was coupled to "AFFI-GEL" (Bio-Rad
Laboratories, Richmond, CA) by incubation for two hours at room temperature and overnight at
4°C. The coupled material was then used to prepare a 3 cm³ column. After the column was washed
extensively with 0.02 M phosphate buffer, pH 8.0, 1-2 ml of P. gingivalis 381 sonicate containing
18 mg/ml protein were added and run through the column using a peristaltic pump generating a flow
rate of 20 ml/hr. The column eluate was monitored for absorbance at 280 nm. The column retentate
was eluted from the column by addition of 0.1 M glycine, pH 2.5. The recovered retentates were
concentrated by centrifugation through a molecular weight cut-off filter, pressure concentration in
an Amicon filter (Amicon, Danvers, MA), lyophilization, or a combination of the above. When a
P. gingivalis 381 cell lysate was applied to an affinity column containing anti-clone 7 rabbit IgG,
and the retained antigenic peptides were eluted and analyzed by SDS-PAGE, a major band at 49-50
kD was evident.

Example 12 - Determination of the Relationship Between the Expressed Antigens of Clones 2, 5 and 7

Although antisera against clones 2, 5, and 7 reacted to P. gingivalis cell lysate at two major
bands of 43,000 and 38,000 MWs, E. coli adsorbed anti-clone 2 antiserum also reacted to the greater
than 125 kD protein band synthesized in clone 2. However, E. coli adsorbed anti-clone 5 and
anti-clone 7 antisera did not react to this expressed antigen band of clone 2.

To further define the relationship of the epitopes of the expressed antigen in clone 2 to
those of clones 5 and 7, adsorption of anti-clone 2 antiserum with several antigens was performed and
each adsorbed anti-clone 2 antiserum was tested for its titer to P. gingivalis whole cell antigen by
ELISA. The antibody titer to P. gingivalis of anti-clone 2 antiserum was removed in a dose-response
manner by adsorption with P. gingivalis and clone 2 cells. Adsorption with E. coli JM109 (pUC9),
clone 5 or clone 7 did not reduce the antibody titer to P. gingivalis of anti-clone 2 antiserum.

The ability of antisera to P. gingivalis and hemagglutinable E. coli to inhibit the 
hemagglutinating activity of P. gingivalis was determined and is summarized in Table 4. All 
antisera inhibited P. gingivalis hemagglutination at titers four to eight times that of normal rabbit 
sera. 



Table 4. Inhibition of hemagglutinating activity of P. gingivalis by anti-hemagglutinating E.
coli antisera.

Antiserum                                   Hemagglutination inhibition titer

Anti-P. gingivalis
    unadsorbed                              640
    adsorbed with E. coli JM109 (pUC9)      640
Normal rabbit serum*                        160
Anti-clone 2                                320-640
Preimmune                                   80
Anti-clone 5                                160
Preimmune                                   40
Anti-clone 7                                160

*Normal rabbit serum and preimmune sera titers are from each particular group of rabbits.

Example 13 - DNA Sequencing of P. gingivalis Hemagglutinin Genes

The P. gingivalis 381 chromosome contains at least five genes which encode hemagglutinin.
The P. gingivalis genes encoding hemagglutinin proteins have been designated hagA, hagB, hagC,
hagD, and hagE. Genes encoding hemagglutinins were cloned using standard procedures as
described above or with minor modifications readily recognized and understood in the art. Plasmid
DNA was isolated from the transformed hosts by a rapid method wherein DNA samples for
sequencing were prepared by an alkaline-lysis/PEG precipitation method. Briefly, transformed E. coli
JM109 cells growing in 50 ml Terrific broth with ampicillin were collected (ca. 0.5 g wet weight)
and resuspended in 2 ml of 50 mM glucose, 25 mM Tris/Cl (pH 8.0), and 10 mM EDTA (pH 8.0).
A freshly prepared 4 ml solution of 0.2 N NaOH, 1% SDS was added and left on ice for 10 minutes.
Then 3 ml of ice-cooled potassium acetate solution was added and left on ice for 10 minutes. The
mixture was centrifuged 30 minutes at 9,000 rpm at 4°C, and RNase A was added to the supernatant
to a final concentration of 20 µg/ml and incubated for 20 minutes at 37°C. The mixture
was extracted thoroughly with chloroform/isoamyl alcohol. An equal volume of isopropanol was
added to precipitate DNA, left for 10 minutes at room temperature, and centrifuged for 30 minutes
at 9,000 rpm at room temperature. The DNA pellet was dissolved in 3.36 ml of H2O. Then 0.64
ml of 5 M NaCl and 4 ml of 13% PEG 8000 (polyethylene glycol, Sigma) were added and left on
ice for more than 1 hour. After centrifugation for 15 minutes at 9,000 rpm at 4°C, the DNA pellet
was dissolved in sterilized water. By this method, 200 to 400 µg of highly purified plasmid DNA
can be obtained in one day.
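
The final salt and PEG concentrations in the precipitation step follow from the volumes given above. The short Python sketch below works them out with the usual C1·V1 = C2·V2 dilution relation; it assumes the three volumes are simply additive, and the function and variable names are illustrative only and are not part of the original procedure.

```python
def final_concentration(stock_conc, stock_volume_ml, total_volume_ml):
    """Solve C1 * V1 = C2 * V2 for C2 (same units as stock_conc)."""
    return stock_conc * stock_volume_ml / total_volume_ml


# Volumes taken from the PEG precipitation step described in the text.
dna_solution_ml = 3.36   # DNA pellet dissolved in water
nacl_stock_ml = 0.64     # 5 M NaCl stock
peg_stock_ml = 4.0       # 13% PEG 8000 stock

# Assumption: volumes are additive on mixing.
total_ml = dna_solution_ml + nacl_stock_ml + peg_stock_ml

nacl_final_molar = final_concentration(5.0, nacl_stock_ml, total_ml)
peg_final_percent = final_concentration(13.0, peg_stock_ml, total_ml)

print(f"total volume:   {total_ml:.2f} ml")
print(f"final NaCl:     {nacl_final_molar:.2f} M")
print(f"final PEG 8000: {peg_final_percent:.1f} %")
```

Under that assumption the mixture comes to about 8 ml, with roughly 0.4 M NaCl and 6.5% PEG 8000.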

A. Characterization of the hagA gene and gene product. The hemagglutinin gene designated
hagA was obtained from the P. gingivalis 381-derived clone ST 2, and was determined to be more
than 4500 bp in length. The ST2-derived DNA sequence is shown in SEQ ID NO. 1.
The open reading frame (ORF) of the hagA gene from clone 2 was determined to encode a
polypeptide of at least 1339 amino acids, and >144 kD. The derived amino acid sequence encoded
by the hagA gene from clone 2 is shown in SEQ ID NO. 2. A 10,119 bp EcoRV fragment was
cloned that included an additional 338 bp of upstream sequence. The complete open reading frame
(ORF) of hagA was found to be 7,887 bp in length (bases 365 to 8251 of the EcoRV fragment),
encoding a protein of 2,628 amino acids with a molecular weight of 283.3 kD. The nucleotide and
deduced amino acid sequences of the entire hagA gene are shown as SEQ ID NO. 13 and SEQ ID
NO. 14, respectively. It was initially found that the hagA sequence has an approximately 1.1 kb
repeating unit which repeats at least four times and may repeat as many as six times, with only minor
differences in the repeat unit. Further analysis confirmed that the hagA gene has four large
contiguous direct repeats totalling 5,404 bp in length, each ranging from 1,318 to 1,368 bp in length.
Specifically, these approximately 1.3 kb repeat fragments, collectively referred to hereinafter as
HArep, are (referring to bp number of the EcoRV fragment): HArep1, bp 1862-3211 (SEQ ID NO. 15);
HArep2, bp 3212-4579 (SEQ ID NO. 17); HArep3, bp 4580-5947 (SEQ ID NO. 19); and HArep4,
bp 5948-7265 (SEQ ID NO. 21). The deduced amino acid sequences for the nucleotide repeat
fragments HArep1, HArep2, HArep3, and HArep4 are shown as SEQ ID NOS. 16, 18, 20, and 22,
respectively. This repeat unit has been shown to have hemagglutinin activity. The results of the
hemagglutinin assay for strains having varying numbers of HArep repeat units are shown in Table
5, below.



Table 5. Hemagglutinin titer

Strain                    No. of HArep            HA titer

381 (wild-type strain)    >4              0.13    1/128
pNH9                      1               3       1/8
pNH1                      2               0.85    1/64
E. coli                   0



When their sequences were compared with that of hagA, several reported protease genes were found
to contain at least one copy of the HArep sequence. For example, prtH, a gene encoding a C3 protease
cloned from strain W83, shares a region of 271 amino acids with 95.6% homology to hagA. Rgp-1, the
arginine-specific cysteine protease/hemagglutinin gene cloned from strain H66, contains a
522-amino acid region with 93.1% homology, as does prtR cloned from strain W50. Agp, cloned from
strain 381 by Okamoto et al., and prpR, cloned by Curtis et al., which are identical genes to rgp-1
isolated from different strains, each contain one HArep sequence of hagA. An additional gene, agp,
which is missing a 713-amino acid internal portion of rgp-1, also contains one HArep sequence. In
addition, prtP, a cysteine protease/hemagglutinin gene cloned from strain W12 and described herein,
has an 849-amino acid C-terminal region which shares 92.2% homology to hagA, with the last 253
amino acids (almost half of the length of the prtP gene) absolutely identical. Tla, another protease
gene cloned from strain W50 by Curtis et al., has a 789-amino acid C-terminal region with 95.2%
homology to hagA, with the last 171 amino acids completely identical. This 171-amino acid region
constitutes almost three-fourths of the length of the Tla gene. In addition, hagD, a fourth
hemagglutinin gene cloned from strain 381, described hereinbelow, has a 523-amino acid region
with 92.7% homology, as well as the 3' 72-amino acid region with 98.6% identity to hagA. HagE, an
additional hemagglutinin gene cloned from strain 381, also described hereinbelow, contains a
518-amino acid region with 92.3% homology to hagA. Without exception, these high homology regions
of each of these genes are within or extend from the repeat region of hagA. The hagA gene is a
central member of a multigene family whose members share the HArep sequence.

In addition, each of these genes contains a common 72-amino acid C-terminus with hagA
(81.9 to 100% homology), except for prtH, in which this region is located in the middle of the gene.

A search through the National Center for Biotechnology Information Database using the
GENINFO Experimental Blast Network Service revealed no significant homology of hagA to any
other sequences in the databases except for the Mycoplasma gallisepticum hemagglutinin genes
(pMGA) and the circumsporozoite protein genes of Plasmodium falciparum. These genes were
found to have homology to hagA in very short regions (9 of 13 amino acids for the circumsporozoite
protein of P. falciparum and 11 of 14 amino acids for pMGA of M. gallisepticum).

To ensure that the complete hagA gene sequence was isolated from clone 2, chromosomal
DNA samples were digested with restriction enzymes which did not cut the original cloned fragment
of clone 2, including AccI, AseI (Biolabs), VspI (the isoschizomer from Promega), BclI, BglII, BstXI,
DraI (BRL), EcoRV, NruI (Stratagene), PstI, PvuII, SalI, SphI, SspI, SstI (Sigma), StuI, and XhoI.
The digested fragments were transferred to positively charged nylon membranes (Boehringer
Mannheim Biochemicals, Indianapolis, IN) by the capillary transfer method. The whole ST2 fragment
was labeled and detected with the nonradioactive Genius Kit (Boehringer Mannheim Biochemicals).
Alternatively, a region of the first 394 bp of clone 2, which is distant from the repeat sequence
region, was labeled using the nonradioactive DIG DNA Labeling and Detection Kit (Boehringer
Mannheim) and used as a probe to detect the bound DNA fragments on the nylon membrane. The
results were made visible on X-ray films by the Lumi-Phos 530 system (Boehringer Mannheim
Biochemicals).

Inverse polymerase chain reaction (IPCR) was used to obtain the flanking 5' and 3' sequences
and thus the entire nucleotide sequence of the hagA gene. To carry out the IPCR procedure, two
18-mer oligo primers, a negative primer at position nt 224 and a positive primer at position nt 2032,
were chosen and synthesized at the University of Florida DNA Synthesis Core Lab. In addition, a
negative primer at 405 nucleotides (nt) upstream of the 5' end of the ST 2 fragment
(GGC AAA CCA AAA AGA TTC, SEQ ID NO. 23) and a positive primer at 529 nt 3' of the ST 2
fragment (TTC TTC CAA CGA CTA CAC, SEQ ID NO. 24) were selected and synthesized at the
University of Florida DNA Synthesis Core Facility.
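
As a quick sanity check on the two primers quoted above, the following Python sketch confirms that each is an 18-mer and reports its GC content and reverse complement. The helper names are illustrative only; nothing here reproduces the primer design procedure actually used, which the text does not describe.

```python
# Primer sequences as quoted in the text (spaces removed).
PRIMERS = {
    "SEQ ID NO. 23": "GGCAAACCAAAAAGATTC",
    "SEQ ID NO. 24": "TTCTTCCAACGACTACAC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_count(seq):
    """Number of G and C bases in the sequence."""
    return seq.count("G") + seq.count("C")


for name, seq in PRIMERS.items():
    print(name, "length:", len(seq),
          "GC:", f"{gc_count(seq)}/{len(seq)}",
          "reverse complement:", reverse_complement(seq))
```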

The total AseI (VspI) digested fragments and the 3-7 kb fragments extracted from an agarose
gel were each self-ligated at a DNA concentration of 1-10 ng/µl with 1 U of T4 DNA ligase (Promega)
per 50 µl reaction mixture for 16 hours at 16°C. Then, the ligation mixture was heated
for 15 minutes at 65°C, extracted with phenol/chloroform and chloroform, precipitated with ethanol
and resuspended in sterilized distilled water. IPCR reactions were performed in 2 steps: first, the
self-ligated DNA sample in buffer was heated for 30 minutes at 94°C; then, Taq polymerase
(Promega) was added and cycled using a PTC-100 Programmable Thermal Controller (MJ Research,
Inc., Watertown, MA). We used 35 cycles of denaturation at 94°C for 1 minute, primer annealing
at 52°C for 1 minute, and extension at 72°C for about 5 minutes.

The amplified mixture was extracted with phenol/chloroform and chloroform and
electrophoresed in a 1% low-melting agarose gel. The excised fragment was then treated with agarase
(Boehringer Mannheim Biochemicals). The DNA samples treated with agarase are pure enough
for direct sequencing. After analysis of direct sequencing data, the amplified IPCR fragment was
cut with HindIII and KpnI, cloned into pBluescript II SK and transformed into E. coli JM109.
Several subclones were constructed and one oligo primer was also synthesized to complete the
sequencing.

Sequencing of the hagA gene was carried out at the University of Florida DNA Sequencing
Core lab using the Taq Dye Primer and Taq Dyedeoxy Terminator Cycle Sequencing Protocol
developed by ABI (Applied Biosystems, Inc., Foster City, CA) with fluorescent labeled primer(s)
and labeled dideoxy nucleotides, respectively. The labeled extension products were analyzed on an
ABI 373 DNA Sequencer. Sequence data were analyzed by the Sequence Analysis Software Package
of the University of Wisconsin.

Southern blot analysis results indicated that AseI restriction of genomic DNA produced a
single 6.9 kb fragment which hybridized to the probe used. Under the conditions used, as described,
a 5,963 bp fragment was successfully amplified via IPCR which, when sequenced, was found to
include an additional 2,997 bp sequence 3' to the ST 2 fragment. The start codon was found to be
located 720 bp upstream of the 5' end of the ST 2 fragment. In order to obtain the 3' end of this
gene, a BamHI gene bank was constructed from which an 8,818 bp cloned fragment containing an
additional 3,362 bp of downstream DNA was obtained. Sequencing this downstream region revealed
that the stop codon was located 1,017 bp downstream of the 3' end of the 6.9 kb AseI fragment.

The complete ORF of hagA, beginning at base No. 365 and ending at base No. 8251, is
calculated to encode a 2628-amino acid protein with a molecular weight of 283.3 kD. Analysis of
the sequence revealed potential -10, -35 consensus sequences located at bases 168 and 143,
respectively. However, no E. coli-like ribosome binding site was found upstream of the start codon
except for AGG at the -4 to -2 position. Two potential stem-loop structures, forming 14 and 9 bp-long
inverted repeats, were identified 51 and 101 bp downstream of the stop codon, respectively.

Residues No. 5-21 are consistent with a typical, hydrophobic leader or signal sequence
according to the Chou-Fasman Prediction. In addition, Chou-Fasman rules predict the beginning
amino acids of HArep to be very antigenic and hydrophilic. The amino acid sequence which begins
each of the HAreps is very similar to a region of M. gallisepticum hemagglutinin genes. The
common repeating amino acid sequence (Pro-Asn) among the P. gingivalis and M. gallisepticum
hemagglutinin genes listed above indicates that this region is involved in erythrocyte binding.
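
The ORF coordinates reported above for hagA (bases 365 to 8251) can be checked against the stated length and residue count with a few lines of Python. This is only an arithmetic sketch: it assumes the coordinates are inclusive and that the final codon of the ORF is the stop codon, and the function names are hypothetical.

```python
def orf_length_bp(first_base, last_base):
    """Length of a region given inclusive, 1-based coordinates."""
    return last_base - first_base + 1


def residues_encoded(length_bp, includes_stop=True):
    """Amino acids encoded by an ORF of the given length.

    Assumption: when includes_stop is True, the last codon is a stop
    codon and does not contribute a residue.
    """
    if length_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    codons = length_bp // 3
    return codons - 1 if includes_stop else codons


hagA_length = orf_length_bp(365, 8251)
hagA_residues = residues_encoded(hagA_length)

# Mean residue mass implied by the reported 283.3 kD molecular weight.
mean_residue_mass_da = 283_300 / hagA_residues

print(hagA_length, "bp;", hagA_residues, "residues;",
      f"{mean_residue_mass_da:.1f} Da per residue")
```

Under these assumptions the coordinates give 7,887 bp and 2,628 residues, in agreement with the figures in the text.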

The repeat region was found to begin immediately after the first KpnI site at base No. 1862
and to end at base No. 7265, making the entire repeat region 5,404 bp in length without a single gap.
The first repeat unit (HArep 1) is 1,350 bp and has 99.5% identity to the second repeat unit. The
repeat units HArep 2 and HArep 3 are 1,368 bp in length and are 99.9% identical to each other. The
fourth repeat unit (HArep 4) is 1,318 bp in length and has 98.6% identity to HArep 2 and HArep 3,
respectively. As shown in SEQ ID NO. 16, the beginning amino acid sequence of HArep 1 is
"Pro Asn Pro Asn Pro Gly Thr Thr Thr ..." while that of the other three is "Gly Thr Pro Asn Pro
Asn Pro Asn Pro Asn Pro Gly Thr Thr Thr ..." (see SEQ ID NOS. 18, 20, and 22). Thus, HAreps
2-4 at the very beginning contain six amino acids more than HArep 1. This difference is due to
HArep 1 containing two fewer repeats of the Pro-Asn sequence, since the Gly-Thr is present before
the sequence of "Pro Asn Pro Asn Pro Gly Thr Thr Thr ..." in HArep 1.
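
The repeat coordinates given for the EcoRV fragment can be tallied directly. The Python sketch below, with illustrative names, recomputes each repeat length, checks that the four units abut without gaps, and confirms that the 18 bp length difference between HArep 1 and HArep 2 corresponds to the six extra amino acids noted above.

```python
# Inclusive bp coordinates on the EcoRV fragment, as given in the text.
HAREP_COORDS = [
    ("HArep1", 1862, 3211),
    ("HArep2", 3212, 4579),
    ("HArep3", 4580, 5947),
    ("HArep4", 5948, 7265),
]

# Length of each repeat unit in bp.
harep_lengths = {name: end - start + 1 for name, start, end in HAREP_COORDS}
total_repeat_bp = sum(harep_lengths.values())

# Each unit should start one base after the previous unit ends.
contiguous = all(
    nxt[1] == prev[2] + 1
    for prev, nxt in zip(HAREP_COORDS, HAREP_COORDS[1:])
)

# Extra codons at the start of HArep2 relative to HArep1.
extra_codons = (harep_lengths["HArep2"] - harep_lengths["HArep1"]) // 3

print(harep_lengths)
print("total:", total_repeat_bp, "bp; contiguous:", contiguous,
      "; extra codons in HArep2 vs HArep1:", extra_codons)
```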

Another distinguishing characteristic of the hagA multigene family is the presence of a
72-amino acid sequence normally at the extreme carboxy terminus of the proteins. This region is
hydrophobic according to the Chou-Fasman Prediction and can serve to anchor the proteins in the
outer membrane or serve in some other common recognition function.

The hemagglutinin (HA) encoded by the hagA gene can have the characteristics of a cysteine
protease, a trypsin-like protease, and a hemagglutinin. Hemagglutinins of Porphyromonas
gingivalis can be involved in virulence. The HAs of P. gingivalis are nonfimbrial adhesins, since
biochemical studies have shown that the purified fimbrillin subunit of P. gingivalis failed to
agglutinate red blood cells or to inhibit hemagglutination by P. gingivalis, and immunological
studies have shown that monospecific antibody against the hemagglutinin did not bind strongly to
the fibrillar structures of P. gingivalis.

It has been previously suggested that protease and hemagglutination activities of P.
gingivalis are related. One study reported that mutant strains of P. gingivalis deficient in
trypsin-like protease activity had markedly reduced hemagglutination activity. Others have reported
that a 44 kD purified outer membrane hemagglutinin has been further characterized as a cysteine
protease. The DNA sequence of hagA was compared with the DNA sequence of an approximately 4.5 kb
fragment of genomic DNA from the λFBP1 clone made from the P. gingivalis W12 strain. The
gene from the λFBP1 clone was isolated and named prtP (see section F of this Example, below).
The prtP gene encodes protein(s) reactive with antibody that inhibits a cysteine protease of P.
gingivalis W12, and that binds fibrinogen. The nucleotide sequences of hagA and prtP were
compared, and were found to contain internal regions approximately 2 kb in size that share a high
degree of sequence similarity. The hagA gene contains three regions that share greater than 90%
sequence identity with prtP. These regions include a 217 bp sequence in which there is 90% identity,
an 884 bp sequence in which there is 94% identity and a 500 bp sequence in which there is 97%
identity. These findings raise the possibility of relatedness between a fibrinogen binding protein and
a hemagglutinin of P. gingivalis.

B. Characterization of the hagB gene and gene product. The hemagglutinin gene
hagB was obtained for sequencing from P. gingivalis on a 2.0 kb HindIII-BamHI fragment and a 2.4
kb BamHI-EcoRI fragment cloned into pUC9 and transformed into E. coli JM109. These fragments
were subcloned into the M13 bacteriophage vectors for sequencing (Yanisch-Perron, C., J. Vieira, J.
Messing [1985] "Improved M13 phage cloning vectors and host strains: Nucleotide sequences of the
M13mp18 and pUC19 vectors," Gene 33:103-119). The entire lengths of these fragments were
sequenced utilizing the universal priming site of M13 and by synthesizing oligonucleotide primers
for the remaining regions of the fragments. The sequencing of the 1.7 kb KpnI-PstI fragment and
the DNA adjacent to the BamHI site ensured that the 2.0 kb and 2.4 kb fragments were contiguous.
E. coli JM109 was used as the host strain for transfection with M13 and grown in 2x YT broth.
Recombinant phages were detected by using soft agar (0.75%) overlays of 2x YT broth base
supplemented with 0.33 mM isopropyl-beta-D-thiogalactopyranoside (IPTG) and 0.02%
5-bromo-4-chloro-3-indolyl-beta-galactoside (X-GAL).

Restriction enzymes, T4 DNA ligase, and M13 17-mer primer were purchased from either
Bethesda Research Laboratories (Gaithersburg, MD) or Fisher Scientific Co. (St. Louis, MO) and
were used in accordance with the specifications of the manufacturers. Other oligonucleotide primers
were synthesized by the Molecular Biology Resource Facility (Oklahoma City, OK). Sequencing
reagents were from the T7 Sequencing Kit of Pharmacia (Piscataway, NJ) or the Sequenase DNA
sequencing kit of U.S. Biochemical Corp. (Cleveland, OH). The [α-35S]dATP was purchased from
DuPont, NEN Research Products (Boston, MA). IPTG and X-GAL were purchased from Sigma
Chemical Co. (St. Louis, MO).

DNA sequencing was performed by using the dideoxy chain-termination method (Sanger,
F., S. Nicklen, A.R. Coulson [1977] "DNA sequencing with chain-terminating inhibitors," Proc.
Natl. Acad. Sci. USA 74:5463-5467). Different portions of each fragment were sequenced from
synthesized oligonucleotide primers. The DNA sequence of the gene was determined for both
strands and was analyzed by the James M. Pustell DNA and protein sequencing program
(International Biotechnologies, Inc., New Haven, CT). The nucleotide sequence of the hagB
hemagglutinin gene is 1053 nucleotides in length as shown in SEQ ID NO. 3. The mol.% G+C
content is 59.9%. The reading frame of the hemagglutinin gene was defined by a putative ribosome
binding site and promoters upstream of the ATG start codon and potential stem-loop structures
downstream of the stop codon. Beginning 181 to 239 bases upstream of the two potential promoters
was a region of direct repeats. A sequence of 41 nucleotides was repeated four times contiguously
with only minor differences. Open reading frames were also identified on the opposite strands both
upstream and downstream of the hemagglutinin gene.

The amino acid sequence of the hemagglutinin was derived from the nucleotide sequence
and determined to be 350 residues in length. The derived protein of Mr=39,375 was basic with an
isoelectric point of 8.98 and hydrophilic. A potential signal peptide is evident. Cleavage is most
probable after amino acids 32-36, though none of these sites conforms ideally to the -3,-1 rules of
von Heijne. The derived amino acid sequence encoded by the hagB gene is shown in SEQ ID NO.
4.

Comparison of the nucleotide and derived amino acid sequences with the gene and protein 
bank libraries did not uncover any significant homology between the hemagglutinin and previously 
determined sequences. 

Upstream from the hemagglutinin reading frame were two potential promoters which in turn
were preceded by a series of direct repeats. The function of the direct repeats is not known but it
would be reasonable to hypothesize that they have a role in gene expression.

The codon usage for the hemagglutinin was examined and found to follow the pattern for
a gene with low level expression, though this pattern was broken in a few instances. In general, the
pattern for low expression consists of a low U/C ratio in the third base position of the codon for
some amino acids, but a high U/C ratio in the third position for other amino acids. Perhaps due to
the high %G+C content of the hemagglutinin gene, a low U/C ratio existed for most amino acids.
Overall, however, the codon usage followed the pattern for low expression more often than that for
high expression. The usage of some codons which specify rare tRNA species in E. coli may also be
evidence of a lower level of expression of the hemagglutinin gene. Alternatively, the same tRNA
species may not be rate limiting in P. gingivalis but could explain the difficulty in expressing the
cloned product in E. coli.
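
The third-position U/C comparison described above can be tabulated per amino acid from any coding sequence. The Python sketch below is one way to do so, assuming the standard genetic code; the example sequence is a made-up toy, not the hagB sequence, and the function names are illustrative.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def third_position_uc_counts(cds):
    """Count codons ending in U (T) or C for each amino acid in a CDS."""
    cds = cds.upper().replace("U", "T")
    counts = defaultdict(lambda: {"U": 0, "C": 0})
    usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
    for i in range(0, usable, 3):
        codon = cds[i:i + 3]
        aa = CODON_TABLE.get(codon)
        if aa is None or aa == "*":
            continue  # skip ambiguous codons and stop codons
        if codon[2] == "T":
            counts[aa]["U"] += 1
        elif codon[2] == "C":
            counts[aa]["C"] += 1
    return dict(counts)


def uc_ratio(count):
    """U/C ratio for one amino acid, or None when no C-ending codon is seen."""
    return count["U"] / count["C"] if count["C"] else None


# Toy sequence for illustration only (Met-Phe-Phe-Gly-Gly-Gly-stop).
toy_counts = third_position_uc_counts("ATGTTTTTCGGTGGCGGCTAA")
for aa, count in sorted(toy_counts.items()):
    print(aa, count, uc_ratio(count))
```

Applying the same tally to the sequence in SEQ ID NO. 3 would give the per-amino-acid ratios discussed in the text; those values are not reproduced here.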

C. Characterization of the hagC gene and gene product. A third hemagglutinin gene,
designated hagC, was isolated from Porphyromonas gingivalis 381. The nucleotide sequence of the
hagC gene is shown in SEQ ID NO. 5 and has a 1050 bp coding region. The derived amino acid
sequence is shown in SEQ ID NO. 6.

The hagC gene was isolated in a manner similar to that used for the hagB gene. Briefly, isolated P.
gingivalis 381 chromosomal DNA was digested with HindIII and electrophoresed through a 0.8%
agarose gel in Tris-acetate buffer. A band of agarose containing the fragments ranging from 4 to 20
kb was cut out of the gel and the DNA extracted using a phenol freeze/thaw procedure. The DNA
was ligated to the dephosphorylated HindIII-restricted pUC18 plasmid (Pharmacia LKB
Biotechnology, Piscataway, NJ). Recombinant plasmids were transformed into E. coli DH5α (BRL)
and plated on LB plates supplemented with ampicillin, IPTG and X-GAL. Colonies were picked on
duplicate plates and grown aerobically at 37°C overnight. The clones from one of the duplicate
plates were transferred to positively charged nylon membranes (BM Corp.) and lysed according to the
procedure described by Sambrook et al. The membranes were then left to dry for 30 minutes and
baked at 120°C for 30 minutes. The hybridization was carried out as described above; however, a
960 bp BamHI-PstI DNA fragment from the hagB gene was used as a probe.

Recombinant plasmid DNA was prepared using the alkaline lysis method, modified as
described. The cells were grown in LB broth supplemented with 50 µg/ml ampicillin. The closed
circular DNA was purified by equilibrium centrifugation in a continuous CsCl-ethidium bromide
gradient. DNA destined for sequencing was additionally subjected to precipitation with
polyethylene glycol.

Double-stranded DNA sequencing was performed by the University of Florida
Interdisciplinary Center for Biotechnology Research DNA Sequencing Core laboratory. Sequencing
was accomplished by employing the Taq Dye Primer and Taq Dye Terminator cycle sequencing
protocols (Applied Biosystems, Inc., Foster City, CA) using the fluorescent primers and
dideoxynucleotides, respectively. The labeled extension products were analyzed on an ABI 373A
DNA sequencer (Applied Biosystems, Inc.). The sequence was obtained for both strands of DNA
using the appropriate subclones or synthetic oligonucleotides synthesized by the University of
Florida DNA Synthesis Core Facility. The sequencing strategy was designed to sequence overlapping
sites used in DNA subcloning. The sequence was analyzed with the Genetics Computer Group
Sequence analysis software.

The 1851 bp HindIII-Ssdl DNA fragment comprising the hagC gene revealed an open
reading frame (ORF) of 350 amino acids corresponding to a 39.3 kD protein with an isoelectric point
of 8.36. The ATG start site, located at position 374 of the DNA, is preceded by putative -10
(TATTAT, ending at nucleotide 144) and -35 (TTGCTG) sequences which differ from the E. coli
consensus promoter sequences TATAAT and TTGACA by one and three nucleotides, respectively.
However, no match to the consensus Shine-Dalgarno sequence could be found upstream of the ATG
codon. A nearly perfect dyad symmetry of 18 nucleotides can be noticed at the end of the hagC ORF
and may represent a potential stem-loop structure used in transcription-termination.
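
The stated differences between the putative hagC promoter hexamers and the E. coli consensus can be reproduced by counting mismatched positions. The short Python sketch below does this; the function name is illustrative.

```python
def mismatches(observed, consensus):
    """Number of positions at which two equal-length sequences differ."""
    if len(observed) != len(consensus):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(observed, consensus))


# Putative hagC hexamers versus the E. coli consensus, as quoted in the text.
minus10_diff = mismatches("TATTAT", "TATAAT")
minus35_diff = mismatches("TTGCTG", "TTGACA")

print("-10 mismatches:", minus10_diff)
print("-35 mismatches:", minus35_diff)
```

The counts are one mismatch for the -10 element and three for the -35 element, as stated above.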
A comparison between the hagB and hagC nucleotide sequences revealed that their ORFs
are 99% homologous, but their upstream and downstream regions are only 39.5 and 34.6%
homologous, respectively. It is worth noting that both genes encode 350-amino acid proteins which
are 98.6% homologous. The HagB protein exhibits a deduced MW of 39.4 kD and pI of 8.98. The
hagB gene possesses two sets of -10 and -35 sequences which are similar to the consensus
sequences found in E. coli. Contrary to hagC, however, a ribosome-binding site can be noted
upstream of the ATG initiation codon in position 363. Furthermore, four repeats of 42 bp each that
are found in the promoter region of hagB are missing from the hagC gene. A potential
transcription-termination stem-loop made by a nearly perfect 17 nucleotide long dyad symmetry can
also be noted at the end of the hagB gene. No nucleotide sequence or protein exhibiting significant
homology to the hagC gene or protein was found using the data bases GenBank, EMBL, or NBRF.

D. Characterization of the hagD gene and gene product. A fourth hemagglutinin gene,
designated as hagD, was isolated from P. gingivalis 381 using standard procedures as described.
The original nucleotide sequence comprising the hagD gene is shown in SEQ ID NO. 7. The hagD
ORF as originally determined codes for a 1087 amino acid, 117 kD protein with a pI of 4.5. The
derived amino acid sequence encoded by the original hagD gene is shown in SEQ ID NO. 8. The
nucleotide sequence for the entire hagD gene is shown as SEQ ID NO. 25. Two open reading frames
were identified within the hagD nucleotide sequence. The first open reading frame, bases 696-1790,
encodes a polypeptide shown as SEQ ID NO. 26. This polypeptide can have activity as a protease.
The second open reading frame, bases 1790-5866, encodes a polypeptide shown as SEQ ID NO. 27.
The second encoded polypeptide has activity as a hemagglutinin.

The P. gingivalis 381 cells were grown at 37°C in Todd-Hewitt broth (THB) supplemented
with 5 µg/ml hemin and 1 ng/ml menadione in an atmosphere of 10% H2-5% CO2-85% N2.
HindIII-restricted genomic DNA was then electrophoresed through TAE agarose gel (9%). The DNA was
transferred to a nylon membrane by the capillary alkaline transfer method using 0.4 M NaOH-0.6
M NaCl and labeled using the nonradioactive DNA labeling and detection kit (Genius, Boehringer
Mannheim). The membrane was prehybridized for 2 hours at 42°C in 5X SSC (0.75 M NaCl, 0.085
M sodium citrate (pH 7.0); blocking agent 0.5% (w/v); N-lauroylsarcosine (Na-salt), 0.1% (w/v);
sodium dodecyl sulfate (SDS), 0.02% (w/v); formamide 50% (v/v)).

The EcoRI-PvuII DNA fragment from hagA was randomly primed by incorporation of
digoxigenin-labeled dUTP. Hybridization was carried out overnight at 42°C. The membrane was
washed twice with each of the following solutions: 2X SSC-0.1% (w/v) SDS at room temperature
for 5 minutes, and 0.1X SSC-0.1% (w/v) SDS at 68°C for 15 minutes. Detection was carried out
using "LUMI-PHOS" 530 (Boehringer Mannheim), the enhancer for chemiluminescent detection
of alkaline phosphatase, according to the manufacturer, and autoradiographed.

A genomic bank was created using HindIII-digested chromosomal DNA from P. gingivalis
381, as described above for hagC. Fragments ranging from 4.8 to 6.4 kb were cut out and the DNA
was recovered using the phenol freeze/thaw procedure. The DNA was then ligated to the
dephosphorylated HindIII-restricted pUC18 (Pharmacia) using T4 DNA ligase overnight at 16°C.

Recombinant plasmids were transformed into E. coli DH5α (BRL) and plated on
Luria-Bertani (LB) (10 g/l Bacto® Tryptone, 5 g/l yeast extract, 5 g/l NaCl, 15 g/l agar) plates
supplemented with 50 µg/ml ampicillin. Colonies were picked, transferred to nylon membranes, and
subjected to lysis in 10% (w/v) SDS, 3 minutes; 0.5 N NaOH-1.5 M NaCl, 5 minutes; 1.5 M
NaCl-0.5 M Tris-Cl (pH 7.4), 5 minutes; and 2X SSC, 5 minutes. The membranes were then left to dry
for 30 minutes and baked at 120°C for 30 minutes. Prior to hybridization the membranes were
washed in 5X SSC, 0.5% SDS, 1 mM EDTA (pH 8.0) for 30 minutes at 50°C. Hybridization was
then carried out as described above using a 1,228 bp HindIII-SmaI hagA DNA fragment as a probe.

Plasmid DNA was isolated and restriction mapping was carried out according to the procedures
described.

Double-stranded DNA sequencing was performed by the University of Florida ICBR DNA
Sequencing Core Laboratory. Sequencing was accomplished by employing the Taq Dye Primer and
Taq Dye Terminator cycle sequencing protocols using the fluorescent primers and dideoxy
nucleotides, respectively. The entire sequence was obtained for both strands of DNA using the
appropriate subclones or synthetic oligonucleotides synthesized by the University of Florida DNA
Synthesis Core Facility. The sequencing strategy was designed to sequence overlapping sites used
in DNA subcloning.

The complete sequence was determined using the Genetics Computer Group sequence
analysis software and the inverse polymerase chain reaction (IPCR) method. For the IPCR
procedure, 50-500 ng of P. gingivalis genomic DNA restricted with BamHI was circularized and
self-ligated with T4 DNA ligase overnight at 16°C. The circularized genomic DNA was amplified
by IPCR in a mixture containing: 160 mM each dNTPs, 1.5 mM MgCl₂, 1X Buffer [1X = 50 mM
KCl, 10 mM Tris-HCl (pH 8.3)], 4 x 10⁻⁴ mM of the primers APF 147
(5'-GGAATGGGAGATGGAACT-3') (SEQ ID NO. 11) and APF 148
(5'-GTAACCCGTATTGTCTCC-3') (SEQ ID NO. 12), and 5 U Taq I. The IPCR amplification was
accomplished with the "PTC-100" Programmable Thermal Controller (MJ Research, Inc.) for 5
linked files as follows: (1) 30 minutes at 94°C for 1 cycle, after which the Taq I was added; (2) 1
minute at 94°C; (3) 1 minute at 52°C; (4) 5 minutes at 72°C, repeat steps 2, 3, and 4, 34 more
times; (5) 10 minutes at 72°C. The amplicon was gel purified and the DNA was extracted using
agarase. The purified amplicon was sent to be sequenced using APF 147 (SEQ ID NO. 11) as the
primer.
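
As a rough check on the IPCR conditions, the following sketch estimates the melting temperature
of the two primers and the total hold time of the thermal program. The Wallace rule
(2°C per A/T, 4°C per G/C) is a common rule of thumb for short oligonucleotides and is an
assumption introduced here, not a method stated in this description; the function names are
hypothetical, and ramp times between temperatures are ignored.

```python
def wallace_tm(primer):
    """Rough melting temperature (deg C) by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def program_minutes(initial, cycle_steps, cycles, final):
    """Total hold time in minutes: initial step + repeated cycle steps + final extension."""
    return initial + cycles * sum(cycle_steps) + final


# Primer sequences as given for SEQ ID NO. 11 and SEQ ID NO. 12.
PRIMERS = {
    "APF 147": "GGAATGGGAGATGGAACT",
    "APF 148": "GTAACCCGTATTGTCTCC",
}

for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, Wallace Tm about {wallace_tm(seq)} C")

# File 1: 30 min at 94 C. Files 2-4: 1 + 1 + 5 min, run once and repeated 34 more
# times (35 cycles in total). File 5: 10 min at 72 C.
total = program_minutes(initial=30, cycle_steps=(1, 1, 5), cycles=35, final=10)
print(f"total hold time: {total} min ({total / 60:.2f} h)")
```

Under this rule both 18-mers give the same estimate, a few degrees above the 52°C annealing step,
which is the usual relationship between primer melting temperature and annealing temperature.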

The recombinant plasmid comprising the hagD gene in E. coli expressed four proteins,
which were subjected to SDS-PAGE under denaturing conditions: a doublet
corresponding to proteins with Mr of 90 and 85.8 kD, as well as an 80 kD and a 20 kD protein.
Based on the intensity of the bands, the 80 kD protein appeared to be the most strongly expressed.
A comparison between hagD and hagA amino acid sequences revealed that they possess an overall
homology of 73.8%, composed of a central region with 90% homology flanked by regions sharing
less than 60% homology. HagD was also found to possess high homology (89.5%) to the prtP gene
product isolated from P. gingivalis strain W12. The N-terminal region of these two proteins
was found to be more homologous (90%) than the C-terminal region (72%). It is therefore possible
that the hagD and prtP gene products represent different alleles of the same gene which evolved
from a common ancestral strain and diverged. Both hagA and hagD transcripts, as determined by
reverse PCR analysis, were detectable only in hemin-replete conditions, as previously reported for
hagC. These results suggest that hagA, hagC, and hagD may be coordinately regulated by hemin
while hagB is differentially expressed.

E. Characterization of the hagE gene and gene product. Using a repeated sequence of hagA
as a probe, an additional fragment approximately 2.6 kb in length was detected in P. gingivalis 381
genomic DNA by Southern analysis. In order to clone this fragment, a genebank was constructed
from P. gingivalis strain 381 genomic DNA and screened by in situ hybridization with the probe.
A total of 59 positive colonies were identified. Restriction enzyme digestion of mini-preparations
of plasmid DNA from 8 positive colonies revealed that 7 of them contained the expected fragment.
Hemagglutination assay demonstrated that the cloned fragment in one orientation conferred a high
level of hemagglutination activity on the E. coli host strain, but no activity when the fragment was
in the opposite orientation. Sequencing data confirmed that the 5' sequence of the clone is unrelated
to that of hagA, while the 3' sequence of 600 bp has high homology to hagA. This homology occurs
in the area of the 1.3 kb repeat in hagA. This discovery of yet another gene, designated hagE, with
areas of homology to hagA, may indicate that these genes represent a multi-gene family with similar
functions and perhaps identical active sites. It is likely that such duplication indicates an essential
or important function to the bacterial species and its interaction with the host.

By constructing a gene library, an 8.64 kb fragment was obtained which, when sequenced,
was found to contain the complete open reading frame (ORF) of hagE. This ORF is 5,064 bp in
length and encodes a 1,687 amino acid, 183.7 kD protein. The nucleotide and amino acid sequences
for hagE are shown as SEQ ID NOS. 28 and 29, respectively. Two other ORFs were found in the hagE
fragment between nucleotides 6580-7551 and 7716-8640, respectively. When comparing the sequence
of hagE with that of prtH, which encodes a C3 protease from strain W83, it was found that the whole
3,658 bp cloned fragment of prtH was within the clone comprising hagE. The hagE fragment
contains an additional 3,761 bp 5' and 1,327 bp 3' of the prtH fragment. The homology of the
common sequence is 98%. However, there are also 16 gaps in comparing the two sequences,
including one one-base deletion, 13 one-base additions, and 2 two-base additions in prtH. This is
likely due to strain differences. However, a sequence of an additional protease gene (rgp-1) reported
from another strain (HG66) showed only 2 gaps in this region and maintained the ORF in relation to
hagE. Most interestingly, translation analysis of our cloned fragment showed there is no prtH-like
ORF present. Therefore, prtH is likely not present in P. gingivalis strain 381. In addition, two
additional ORFs directly downstream of hagE were identified within the cloned fragment. The
sequencing of hagE has revealed it to be a member of the HagA multi-gene family.
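
The relationship between the hagE ORF length and the size of the encoded polypeptide can be
checked with simple arithmetic. The sketch below assumes that the quoted 5,064 bp includes the
stop codon and uses an average residue mass of about 110 Da, a generic approximation that is not
taken from this description; the function names are hypothetical.

```python
def protein_length_from_orf(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of orf_bp base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # Assumption: the quoted ORF length counts the stop codon, which encodes no residue.
    return codons - 1 if includes_stop else codons


def rough_mass_kd(residues, avg_residue_da=110.0):
    """Rough protein mass in kD from an assumed average residue mass (about 110 Da)."""
    return residues * avg_residue_da / 1000.0


hag_e_residues = protein_length_from_orf(5064)
print("hagE residues:", hag_e_residues)
print("rough mass (kD):", round(rough_mass_kd(hag_e_residues), 1))
```

With these assumptions the residue count matches the 1,687 amino acids stated above, and the
rough mass falls within a few percent of the 183.7 kD calculated from the actual sequence.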

F. Characterization of the prtP gene and gene product. A gene and polypeptide having
homologous regions to those of the hagA, hagB, hagC, hagD, and hagE genes and gene products
were isolated from Porphyromonas gingivalis W12. The P. gingivalis DNA insert in λFBP1 was
4.5 kb (pHW2) and was subcloned for sequencing. It contained a large open reading frame, which
encodes approximately the carboxy-terminal two-thirds of the cysteine proteinase, porphypain. The
complete gene encoding porphypain was obtained using PCR and IPCR technology. The gene,
which has a nucleotide sequence as shown in SEQ ID NO. 9, is designated prtP. The deduced amino
acid sequence of the prtP gene is shown in SEQ ID NO. 10.

Four repeated amino acid sequences and more than five Pro-Asn tandem repeats were
identified in the carboxy-terminal three-fifths of the gene. Repeat 1 includes amino acid segments
688-708 and 946-967; repeat 2 includes three amino acid segments, 887-952, 1341-1405, and
1607-1650; repeat 3 includes amino acids 985-1006 and 1430-1451; and repeat 4 includes amino
acids 1041-(1100) and 1488-(1547). These repeats can be functionally or structurally important. For
example, a Pro-X motif in the TonB protein has been implicated in crossing the periplasmic space.
Based on Southern blot analyses, Repeat 2 was present in at least 20 copies in each of the seven P.
gingivalis genomes examined. The pattern of bands observed in these analyses was very similar for
strains W50 and W83, but not identical; these strains have been previously indistinguishable when
analyzed by multilocus enzyme electrophoresis, DNA fingerprinting, and arbitrarily primed PCR.
Therefore, the repeats can be useful for distinguishing P. gingivalis strains. Strains ATCC 33277
and 381 showed an identical banding pattern in our analysis, which supports previous analyses
characterizing the relatedness of the strains and the suggestion that strain ATCC 33277 is actually
a derivative of strain 381.
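
The repeat coordinates listed above can be tabulated to compare segment lengths and to see where
the repeats begin relative to the full-length polypeptide. This is only an illustrative sketch: the
dictionary layout and names are hypothetical, the end coordinates of repeat 4 are given in
parentheses in the text and are taken at face value here, and the total length of 1,732 amino acids
is the figure given for PrtP elsewhere in this description.

```python
# Amino acid coordinates (start, end), inclusive, as listed for the PrtP repeats.
REPEATS = {
    "Repeat 1": [(688, 708), (946, 967)],
    "Repeat 2": [(887, 952), (1341, 1405), (1607, 1650)],
    "Repeat 3": [(985, 1006), (1430, 1451)],
    "Repeat 4": [(1041, 1100), (1488, 1547)],  # end coordinates parenthesized in the text
}
PRTP_LENGTH = 1732  # amino acids, as stated for the prtP open reading frame


def segment_lengths(segments):
    """Length of each inclusive (start, end) segment."""
    return [end - start + 1 for start, end in segments]


for name, segments in REPEATS.items():
    print(name, segment_lengths(segments))

# Fraction of the polypeptide lying at or beyond the first repeated residue.
first_repeat_start = min(start for segs in REPEATS.values() for start, _ in segs)
c_terminal_fraction = (PRTP_LENGTH - first_repeat_start + 1) / PRTP_LENGTH
print("repeats begin at residue", first_repeat_start)
print("C-terminal fraction containing repeats:", round(c_terminal_fraction, 2))
```

The computed fraction is consistent with the statement that the repeats lie in the
carboxy-terminal three-fifths of the gene product.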

Several other P. gingivalis genes with homology to prtP have been described. Most of
hagA, which encodes a hemagglutinin identified originally in strain 381, was highly homologous to
the C-terminal portion of prtP, including four-and-a-half copies of a large DNA segment
encompassing the prtP Repeat 2 sequence. Our data were consistent with the presence of hagA in
the seven strains examined. Certain evidence suggests that an extracellular form of PrtP participates
in hemagglutination, indicative of the function of the large region the proteins have in common. Five
proteinase genes previously identified in P. gingivalis were also found to be partially homologous
to prtP: rgp-1, prpR1, prtR, prtH, and agp. Each of these genes is thought to encode a proteinase
with Arg-X specificity, but not Lys-X specificity, and none of them had homology to the N-terminal
portion of PrtP. The subject proteinases from the subject strain W12 have been demonstrated to
degrade fibrinogen and fibronectin and hydrolyze both N-p-tosyl-Gly-Pro-Lys-p-nitroanilide and
N-p-tosyl-Gly-Pro-Arg-p-nitroanilide.

Genomic DNA from Porphyromonas gingivalis W12 was isolated using standard
procedures, as described herein, and was purified and disrupted by shearing. EcoRI linkers were
ligated to the ends of P. gingivalis DNA fragments of appropriate sizes, and the fragments were
cloned into the λgt11 vector. The λgt11 library was screened using polyclonal antibodies raised
against a 120-kD cysteine proteinase (porphypain), purified from P. gingivalis W12. Several clones
were isolated that reacted strongly with the anti-proteinase antibody. One of the clones, λFBP1,
reacted strongly with the antibody, and contained a protein which bound fibrinogen.

The gene prtP has an open reading frame extending from bases 696 to 5894 and encodes
a unique protein of 1732 amino acids, including a putative signal sequence for protein secretion. The
predicted molecular mass for the mature protein was 186 kD, which is close to the observed
molecular mass of 180 kD. There was one copy of prtP in the genomes of seven P. gingivalis
strains examined (ATCC 33277, 381, W50, W83, W12, HG66, and ATCC 53977). The gene is
located 5' to a region with a high degree of homology to the insertion element IS1126 in P.
gingivalis strain W12. The PrtP protein had regions of high homology to HagA, a hemagglutinin
of P. gingivalis, and to several purported proteinases of P. gingivalis that have Arg-X specificity.
A detailed comparison of genes encoding the latter and cpgR indicates that rgp-1, prpR1, prtR, agp,
cpgR, and possibly prtH can be derived from identical genetic loci. Although an rgp-1-like locus
was detected in seven P. gingivalis strains by Southern blot analyses, agp and cpgR were not
detected, not even in the strains from which they were originally isolated. In addition, at least 20
copies of a repeat region common to PrtP, the Rgp-1-like proteins, and HagA were observed in each
of the seven genomes examined. The repeat region hybridization patterns for strains W83 and W50
were very similar, and they were identical for strains 381 and ATCC 33277, providing further
evidence that these strains are closely related genetically.

P. gingivalis organisms produce a number of proteolytic enzymes which are found both
extracellularly and associated with the bacterial cell surface. Most of these P. gingivalis enzymes
have been referred to previously as "trypsin-like" based on their preferential hydrolysis of proteins
and peptides on the carboxyl side of basic amino acid residues. However, the designation is
inappropriate because recent characterizations of the enzymes indicate they are cysteine
proteinases.

The large, cell surface-associated cysteine proteinase (porphypain; PrtP) from P. gingivalis
W12 can hydrolyze synthetic peptide substrates with either arginine or lysine residues in the P1
position. Hydrolysis of both Arg-X substrates and Lys-X substrates is activated by reducing agents
(cysteine » β-mercaptoethanol = DTT), and derivatives of glycine stimulate both activities. Both
activities are inhibited by EDTA; however, hydrolysis of Arg-X substrates is inhibited by leupeptin
and preferentially by Tyr-Pro-Arg chloromethyl ketone (YPRCK) over TLCK, and hydrolysis of
Lys-X substrates is unaffected by leupeptin and preferentially inhibited by TLCK over YPRCK,
indicating the presence of two types of active sites. The porphypain of the subject invention can
contain two separate enzymes or a single enzyme which has one active site with two different
conformations: one which accepts lysine in P1, and the other which accepts arginine in P1.



Example 14 - Construction of DNA Probes

DNA-DNA hybridization assays (DNA probes) are based on the fact that single-stranded
DNA will re-anneal only with a complementary strand of DNA whose sequence is homologous.
More recently, DNA probes have been used as a means of detecting various infectious agents and
some are now used routinely in clinical microbiology laboratories. The identification of DNA
sequences of oral Porphyromonas sp. makes it possible to create DNA probes for the identification
of these species. Therefore, one application of the identification and isolation of genomic sequences
which encode bacterial antigens is the use of the DNA fragments as DNA probes. In the current
case, these probes may comprise the Porphyromonas clones identified herein, or fragments of these
clones. Also, the DNA sequences shown in SEQ ID NOS. 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23,
25, and 28, or fragments of those sequences, can be used to construct suitable probes.

Each recombinant plasmid is isolated and digested with whichever restriction enzyme was
used to generate that particular genomic library. The digested plasmid DNA is then separated
electrophoretically on an agarose gel as described earlier. The Porphyromonas DNA band
containing the fragment is cut out of the gel and the DNA fragment is recovered by electro-elution
employing centrifugal filtration of DNA fragments through a Durapore (Millipore) membrane inside
a conical tip. This rapid and simple method recovers 70% of the DNA in a highly pure state.

The conical tip is assembled as follows: the conical portion of a 1.5 ml Eppendorf tube is
cut off and a hole pierced in the bottom with a thin wire. A 4.5 cm² piece of Durapore (Millipore)
membrane is wetted (dH₂O) on a piece of parafilm, the filter square is then formed around a blunt-
ended glass rod, and the filter is placed inside the conical bottom (cone). Excess filter is cut away,
the filter tip is placed inside a 1.5 ml Eppendorf tube, and the filter is prewetted with 200 µl of
elution buffer (0.1% SDS + 50 mM Tris-HCl, pH 7.5). The gel slice is then transferred to the
prepared conical tip. After centrifugation of the DNA preparation in a microcentrifuge (Eppendorf)
for 10 minutes, the filtered aqueous phase containing the DNA is precipitated by the addition of 5
M NaCl (to 1 M) and two volumes of ethanol. After ethanol precipitation, the DNA fragment(s) is
labeled non-radioactively, using a photo-activatable biotin tag as described by the supplier (Clontech
Laboratories, Inc.).

For biotin labelling, the DNA fragment preparation is adjusted to a concentration of 1 mg/ml
(TE) and is mixed with photo-activatable biotin (PAB) at a ratio of 1:3 (DNA:PAB) in a 1.5 ml
Eppendorf tube. The tube is placed in an ice bath 10 cm below a 275 W (GE RSM) sunlamp and
the DNA + PAB is irradiated for 15 minutes. The DNA solution is then mixed with an equal volume
of 0.1 M Tris-Cl (pH 9.0) and the volume adjusted to ≈100 µl with H₂O. The unincorporated PAB
is extracted from the DNA by the addition of an equal volume of 2-butanol, vortexing, centrifuging
briefly, and withdrawing the lower aqueous phase with a Pipetman. The extraction can be repeated
to remove any traces of unbound PAB. 3 M NaOAc (pH 5.6) is added to the DNA solution to a final
concentration of 0.3 M and the labeled DNA is precipitated by the addition of three volumes of
ethanol.
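
Both precipitation steps described above (5 M NaCl added to a final 1 M, and 3 M NaOAc added to
a final 0.3 M) require working out how much stock to add to a given sample. A minimal sketch of
that calculation follows. It assumes the final concentration refers to the sample after the stock has
been added and before ethanol, and that the ethanol "volumes" are relative to the salted sample;
the 100 µl sample volume and the function names are illustrative only.

```python
def stock_volume_to_add(sample_volume, stock_conc, final_conc):
    """Volume of stock to add so the mixture reaches final_conc.

    Derived from stock_conc * v = final_conc * (sample_volume + v), assuming the
    sample initially contains none of the salt.
    """
    if not 0 < final_conc < stock_conc:
        raise ValueError("final concentration must be positive and below the stock")
    return sample_volume * final_conc / (stock_conc - final_conc)


def ethanol_volume(sample_volume, salt_volume, volumes):
    """Ethanol to add, taking 'volumes' relative to the sample plus added salt (assumption)."""
    return (sample_volume + salt_volume) * volumes


sample_ul = 100.0  # illustrative sample volume in microliters

nacl_ul = stock_volume_to_add(sample_ul, stock_conc=5.0, final_conc=1.0)
print(f"5 M NaCl to 1 M: add {nacl_ul:.1f} ul, then "
      f"{ethanol_volume(sample_ul, nacl_ul, 2):.1f} ul ethanol")

naoac_ul = stock_volume_to_add(sample_ul, stock_conc=3.0, final_conc=0.3)
print(f"3 M NaOAc to 0.3 M: add {naoac_ul:.1f} ul, then "
      f"{ethanol_volume(sample_ul, naoac_ul, 3):.1f} ul ethanol")
```

In practice this reduces to adding one-quarter of the sample volume of 5 M NaCl, or one-ninth of
the sample volume of 3 M NaOAc.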

After the sample is cooled at -70°C for 15 minutes, the precipitated DNA is recovered by
centrifugation for 10 minutes. The DNA pellet is dissolved in 10 mM Tris (pH 7.9) and 0.1 mM
EDTA. The labeled probe DNA remains stable for one year if stored at -20°C.

A non-radioactive method of labeling the DNA probes may be desirable because: (1) the
photoactivatable reactions are simple and rapid, (2) the sensitivity is as high as ³²P-labeled probes,
(3) the PAB-labeled probes have a long storage life, (4) these probes are relatively inexpensive, and
(5) detection of bound probes is by simple colorimetric methods. The radioactive labeling of probes
requires the use of ³²P, which has a very short half-life (14 days) and is thus unstable and expensive.
The use of radioactive probes would be limited because of cost, the dangers of radioactivity, strict
requirements for disposal, and the need for licensing. However, if for some reason the biotin-HRP
method of labeling is unacceptable, the DNA fragments can be labeled with [α-³²P]deoxy-CTP by
standard nick translation methods as described by Maniatis et al. (1982, supra). Other labelling
techniques which are well known or accepted by ordinary skilled artisans can also be employed for
visualization of the nucleic acid probes.
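
The storage-life argument above can be made concrete with the standard exponential decay
relation N(t) = N0 * (1/2)^(t / T), where T is the half-life. The sketch below uses the 14-day
half-life quoted in the text for ³²P; the function name is hypothetical and the calculation is
illustrative only.

```python
def fraction_remaining(days, half_life_days=14.0):
    """Fraction of the original radioactivity left after the given number of days."""
    return 0.5 ** (days / half_life_days)


# Activity remaining after two weeks, four weeks, and one year of storage.
for days in (14, 28, 365):
    print(f"after {days} days: {fraction_remaining(days):.3g} of initial activity")
```

After one year, the period over which the biotin-labeled probe is described as stable, essentially
none of the initial ³²P activity would remain.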



Example 15 - Determining the Specificity of the DNA Probes

The prepared DNA probes are screened for specificity against a battery of oral 
Porphyromonas species, other oral species, and other non-oral gram-negative bacteria. 

Cultures of the test strains are grown in appropriate medium to a density of approximately
10⁹ cells per ml. The cells are centrifuged and suspended in 5.0 ml of distilled water. Sodium
hydroxide is added to 0.5 N and the cells are incubated at 90°C for 20 to 30 minutes in order to lyse
the cells and denature the DNA. The cell suspension is neutralized by the addition of 0.5 N HCl
diluted in 20x SSC and chilled on ice for 20 minutes. A volume of 0.5 ml (or less) of the suspension
is diluted to 4.0 ml volume with 10x SSC and vacuum filtered in a manifold onto nitrocellulose paper
(type HAWP, 0.45 µm, Millipore Corp.) which is prewetted with 10x SSC. After the filters are
rinsed with 4.0 ml of 10x SSC, they are dried and heated at 85°C for 3 hours in a vacuum oven (this
fixes the chromosomal DNA onto the filter). After the filters are incubated for 2-3 hours at 42°C
with the prehybridization buffer (6x SSPE [1.08 M NaCl, 0.06 M NaH₂PO₄, 0.48 M NaOH, 6.0 mM
Na₂EDTA, pH 7.0], 5x BFP [0.1% BSA, 0.1% Ficoll, and 0.1% polyvinylpyrrolidone], 1% [w/v]
glycine, 50% formamide, and 100 µg denatured salmon sperm DNA/ml), the prehybridization buffer
is replaced with hybridization buffer containing 0.01 to 0.1 µg of labeled heat-denatured probe DNA
in 5x SSPE, 1x BFP, 50% formamide, 100 µg salmon sperm DNA/ml, 0.3% SDS, and 10% sulfate.
Hybridization is accomplished by incubating the DNA mixtures for 12 hours at 42°C. The filters
are then washed twice in 2x SSPE - 0.2% SDS for 25 minutes at 60°C in order to remove any
unhybridized probe DNA.

The hybridized (bound) probe DNA can be detected by incubation of the filters for 30
minutes on 1 M NaCl + 0.1 M Tris-HCl (pH 7.5) + 2 mM MgCl₂ + 0.05% "TRITON" X-100 + 3%
BSA and then for 25 minutes in 1 mg/ml streptavidin-alkaline phosphatase conjugate in the same
buffer. Next, the filters are washed 3 times with 50-100 ml of buffer containing 1 M NaCl, 0.1 M
Tris-HCl, pH 7.5, 2 mM MgCl₂, and 0.05% "TRITON" X-100. A fourth wash of buffer contains
0.1 M NaCl and 0.3 M sodium citrate, pH 7.0. The color is developed by the addition of 32 µl
nitroblue tetrazolium and 16 µl 5-bromo-4-chloro-3-indolyl-phosphate in 5.0 ml of 0.1 M NaCl + 0.3
M sodium citrate. After incubation in subdued light for 30 minutes, any spots which are visible
indicate hybridization of probe DNA to target DNA.

If ³²P-labeled probes are used, the same hybridization conditions can be used (adding 10⁶
CPM of probe), but instead of adding the streptavidin conjugate, the filters are dried for 1-2 hours
at 70°C, and hybridization is detected by autoradiography. Alternatively, the filters can be cut into
squares, placed into scintillation vials, and counted in scintillant.

Once probes are identified which are specific for either B. intermedius or P. gingivalis, or
several Porphyromonas spp., they can be tested with known mixtures of the test bacteria grown on
plates as follows: various mixtures of the test bacteria can be prepared with a known concentration
of B. intermedius or P. gingivalis and spread on agar plates and incubated anaerobically as
described earlier in this proposal. After the colonies have appeared (2-4 days), they are blotted onto
nitrocellulose membranes, and the membranes processed for hybridization. If the DNA probe(s) is
specific and sensitive, then only the P. gingivalis or B. intermedius colony blots should be positive.
It is also possible that a probe may be found that is genus or group specific.

DNA probes for chromosomally-encoded genes require 10⁵ to 10⁶ bacteria per colony or dot
blot in order to give a reliable positive result. This is comparable to 1 to 10 pg of DNA. Given this
level of detection, a primary culturing step is desirable prior to blotting the colonies onto membrane
filters and hybridization with the probe DNA.



Example 16 - Vaccines

In view of the immunoprotectant activity exhibited by certain of the compositions of matter
of the subject invention, vaccines may be produced from the polypeptides expressed by cells which
have been transformed with DNA fragments from Porphyromonas gingivalis. By introducing these
peptides, along with a pharmacologically suitable vehicle, into the human or animal host, that host
can be induced to generate immunological protection against P. gingivalis. The preparation of such
a vaccine composition is within the skill of one trained in the medical and immunological sciences.
Cells which can be used to produce recombinant peptides include, but are not limited to, bacteria,
yeasts, insects, and eukaryotic cells.



Example 17 - Construction of an Oral Vaccine

It has been recognized that natural infection with enteric organisms produces the highest
levels of antibodies and the longest lasting immunity to reinfection. The use of Salmonella as an
attenuated vaccine carrier organism has several advantages. Salmonella spp. are capable of
colonizing the Peyer's patches and gut lamina propria where they elicit a strong local IgA response
in the intestine. The IgA response is also spread to other external secretions such as saliva by the
seeding of these tissues with plasma cell precursors primed in the gut via the so-called common
mucosal immune system. These responses are important in preventing initial adhesion and
colonization of mucosal surfaces, the initial step in the etiology of periodontal disease. In addition,
live Salmonella elicits a humoral (serum) response of the IgM, IgG and IgA isotypes due to its
invasive nature. Finally, infection with live organisms also stimulates a cell-mediated immune
response (primarily T-cell mediated stimulation of macrophages), which is important in immunity
since Salmonella can survive intracellularly within phagocytic cells. Several non-virulent mutants
of Salmonella spp. have been developed. For example, an attenuated galE mutant of S. typhi (strain
Ty21a) which lacks the enzyme UDP-galactose-4-epimerase has been developed.

Another approach to attenuation has been to use aromatic amino acid dependent (aro⁻)
strains of Salmonella which are nonvirulent because they require metabolites not found in
mammalian tissues, i.e., p-aminobenzoate and 2,3-dihydroxybenzoate. The strains are constructed
using the aroA554::Tn10 transposon, and, because it can cause deletion or deletion-inversion
mutations, one can generate non-reverting mutants. These mutants synthesize a complete smooth
LPS, are able to effectively colonize the Peyer's patches and gut, and are highly immunogenic. In
mice of the Salmonella-susceptible line BALB/c, intraperitoneal injection of as few as 2 x 10⁵ aro⁻
S. typhimurium protected against an i.p. challenge of 5 x 10⁵ virulent parent cells 30 days later
(>25,000 i.p. LD₅₀). Oral immunization with 2 x 10⁸ aro⁻ cells protected mice against an oral
challenge of 3 x 10⁷ virulent organisms (ca. 100 oral LD₅₀).

Because live Salmonella is such an efficient stimulator of mucosal immunity, it can be used
as a carrier to deliver recombinant gene products cloned from other pathogens directly to the tissues
(i.e., Peyer's patches) which most efficiently generate an immune response in the gut, and through
the common mucosal immune system, to other distant secretory sites. At the same time a humoral
immune response is stimulated which may further help prevent or abort invasion. Using cloned
antigens in a Salmonella carrier system gives one the ability to target the immune response to
important virulence antigens, leading to a protective immune response.

Chromosomal DNA was isolated from P. gingivalis strain 381 by the following method:
One to three liters of cells were pelleted by centrifugation and washed (on ice) in 1/50 volume of 1X
SSC buffer (0.87% NaCl, 0.04% Na citrate) containing 27% sucrose and 10 mM EDTA. The cells
were again pelleted and resuspended to 10¹⁰ cells/ml in the same buffer. Lysozyme (5 mg/ml in 1X
SSC buffer) was added to 0.5 mg/ml, the cells were mixed thoroughly and incubated at 37°C for 10
minutes. Nine volumes of 1X SSC containing 27% sucrose, 10 mM EDTA and 1.11% SDS
(prewarmed to 39°C) were added to the cells and incubated at 37°C until cell lysis was complete
(10-30 minutes). The lysed cells were mixed gently and incubated at 37°C for 30 minutes.
Proteinase K (Sigma, St. Louis, MO) was added to a final concentration of 1 mg/ml and the lysate
was incubated at 37°C for 4 hours. An equal volume of phenol-Tris (9:1 freshly distilled phenol:1
M Tris-HCl, pH 7.5) was added to the Proteinase K-treated mixture and the mixture was agitated
gently at room temperature for 30 minutes. The DNA mixture was then centrifuged in 150 ml Corex
tubes at 3000 rpm. The top (phenol) layer was removed and discarded. The phenol extraction was
repeated and the DNA (aqueous) layer was dialyzed extensively against 10 mM Tris-HCl, pH 8.0,
1 mM EDTA. Finally, the DNA was incubated with RNase at 37°C for 1 hour.

Expression vectors which contain a promoter upstream from the cloning site were used to
help ensure that cloned DNA was expressed whether or not a structural gene was cloned with its own
promoter. The expression plasmid pUC9 (2.7 kb) contains the origin of replication, ampicillin
resistance gene, and lac gene of pBR322. The lac HaeII fragment (lac gene) contains a polylinker
region from M13mp9 which has multiple unique cloning sites in the gene that encodes the
α-peptide of β-galactosidase. Thus, recombinant vectors that contain an insert in any of the cloning
sites generate white colonies on X-GAL plates since they are not able to degrade the lactose analog,
X-GAL. Vectors without an insert degrade X-GAL and result in blue colonies on X-GAL plates
since the gene is not interrupted by an insert. Other plasmid vectors are available and could be used.
One such plasmid is pAD 230.

The chromosomal DNA and vector DNA were ligated with T4 DNA ligase at ratios of 2:1
and 5:1. The ligated DNA was phenol-chloroform (24:1 isoamyl alcohol) extracted, ethanol
precipitated, washed, dried, and redissolved in TE. Early log-phase cells (OD=0.2 to 0.5) were
washed with transformation buffer 1 (TFM 1, 10 mM Tris-Cl, pH 7.5, 0.15 M NaCl). The cells
were pelleted, resuspended, and incubated on ice for 45 minutes in TFM 2 (50 mM CaCl₂). After
the cells were again pelleted, they were gently resuspended once more in TFM 2. A 0.2 ml volume
of cells was added to 0.1 ml TFM 3 (10 mM Tris-HCl, pH 7.5, 50 mM CaCl₂, 10 mM MgSO₄·7H₂O)
on ice. Varying amounts of DNA were added to the cells. The tubes were incubated on ice for 45
minutes, at which time the cells were heat shocked at 37°C for 2 minutes. A 0.5 ml volume of LB
broth was added per tube and the cells were incubated at 37°C for 30 to 60 minutes to allow
expression of antibiotic resistance. Finally, the cells were spread on plates of LB + antibiotic (50
µg/ml ampicillin) and X-GAL and incubated 24 to 48 hours at 37°C.

Any colonies which appeared on the LB + ampicillin + X-GAL plates after 24-36 hours of
incubation were transformants which contained and expressed pUC9. A large number (80-90%) of
these were white colonies which contain a plasmid with inserted P. gingivalis DNA. Once a
transformant was identified which expressed P. gingivalis SHA adhesin, the protein was identified
by Western blotting cell lysates of the transformant.

Because the initial cloning was done in E. coli, the recombinant plasmids may be modified
by the E. coli modification system. These modified recombinant plasmids were used to transform
strains of Salmonella. Initially, recombinant plasmids were passed into Salmonella typhimurium
strain LB 5000, which is restriction⁻ (is not able to restrict foreign DNA) but modification⁺. This
modifies the plasmid DNA according to the Salmonella system.

Recombinant P. gingivalis plasmids encoding for the Porphyromonas (SHA) adhesin can
be isolated and purified as described above. The identity and purity of the preparation can be
monitored by restriction analysis and agarose gel electrophoresis. Cells of Salmonella strain LB
5000 can be made competent and transformed with the recombinant plasmid as described above.
Transformants can be selected by growth in the presence of ampicillin and are tested for the
expression of the Porphyromonas antigen also by procedures described above.

The recombinant plasmid can be isolated from strain LB 5000 and the identity of the
plasmid verified. The purified plasmid can be used to transform non-reverting nonvirulent mutants
of various Salmonella spp. These strains include (but are not limited to) S. enteritidis
(typhimurium) SL 3261 (WRAY his G46 aro A), SL 1479 (UCD his C527 aro A), SL 3237 (FIRN
rps L120 aro A), and S. dublin SL 3261 (his 646 aro A). Transformants can be screened for
resistance to ampicillin and assayed for expression of the Porphyromonas antigen by enzyme-linked
immunosorbent assay as described above. In addition, SDS-PAGE and Western blotting can be done
to confirm the presence of the antigen in the Salmonella transformants.

The P. gingivalis hemagglutinin protein was expressed in nonvirulent Salmonella
typhimurium strain SL3261/CL7 and tested for activity as a competitive inhibitor of
hemagglutination. The S. typhimurium cells were broken by sonic disruption, whole cells and debris
were removed by centrifugation, and the supernatant was adjusted to 40% saturation with NH4SO3. The
supernatant was collected, dialyzed, and fractionated on a CM-Sephadex column using a 50-450 mM
NaCl gradient. Fractions were evaluated via Western blot analysis for reactivity with absorbed sera
directed against P. gingivalis. The peak fraction was found to inhibit hemagglutination of
erythrocytes by whole P. gingivalis cells. This same material was analyzed for the N-terminal amino
acid sequence and found to match the sequence predicted from the cloned gene.

The gene for the Porphyromonas antigen can also be transduced into the Salmonella carrier
strains by P22 transduction. Transductants can be selected by growth in the presence of ampicillin
and by the expression of the Porphyromonas antigen, as detected by immunoblotting using the
monospecific or monoclonal antibody.

Additional carrier strains can be generated from other Salmonella serotypes. These strains
can be derived from virulent strains by the introduction of mutations such as (auxotrophic) aro A
or gal E. In addition, the "O" antigen may be altered or mutated to a rough LPS in strains already
avirulent by P1 transduction.

Example 18 - Monoclonal Antibodies

Appropriate mice can be immunized with antigens of, or cells expressing antigens of,
Porphyromonas gingivalis. The antigens used for this immunization can be those which are
identified and described in the previous examples in view of their exhibited immunogenic activity.
The techniques employed to accomplish this immunization procedure are familiar to those skilled
in this art. The spleens can then be removed from the immunized mice and the cells therefrom fused
to SP-2 myeloma cells using polyethylene glycol. The desired hybrid cells can then be selected by
adding hypoxanthine-aminopterin-thymidine to the medium. The surviving cells can then be tested
for antibody production. The testing for antibody production can be accomplished using ELISA,
immunoblot, and/or Western blot procedures as described in the previous examples.

The monoclonal antibodies produced by the procedure just described can be used to test for
the presence of P. gingivalis antigens in a sample of biological fluid. The testing procedure involves
contacting the biological fluid with a composition containing one or more of the monoclonal
antibodies. If P. gingivalis antigens are present in the biological fluid, then a reaction will occur and
this reaction can be detected and quantified by fluorescence or other means.

It should be understood that the examples and embodiments described herein are for
illustrative purposes only and that various modifications or changes in light thereof will be suggested
to persons skilled in the art and are to be included within the spirit and purview of this application
and the scope of the appended claims.
SEQUENCE LISTING



(1) GENERAL INFORMATION: 



(i) APPLICANT(S) INFORMATION:

Applicant Name: University of Florida
Street address: 186 Grinter Hall
City: Gainesville
State/Province: Florida
Country: US
Postal code/Zip: 32611
Phone number: (904) 392-8929 Fax: (904) 392-6600

Applicant Name: University of Alabama-Birmingham Research Foundation
Street address: 1120 G Administration Bldg., 701 20th St. South
City: Birmingham
State/Province: Alabama
Country: US
Postal code/Zip: 35294
Phone number: (205) 934-9911 Fax: (205) 975-5560



(ii) TITLE OF INVENTION: Cloned Porphyromonas gingivalis Genes 
and Probes for the Detection of Periodontal Disease 



(iii) NUMBER OF SEQUENCES: 29 



(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Ted W. Whitlock 

(B) STREET: 2421 N.W. 41st Street, Suite A-1

(C) CITY: Gainesville 

(D) STATE: FL 

(E) COUNTRY: USA 

(F) ZIP: 32606



(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 
(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25



(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: US
(B) FILING DATE:
(C) CLASSIFICATION:



(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/353,485 

(B) FILING DATE: 09-DEC-1994 

(C) CLASSIFICATION: 



(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/647,119 

(B) FILING DATE: 25-JAN-1991 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/241,640 

(B) FILING DATE: 08-SEP-1988 



(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Whitlock, Ted W.
(B) REGISTRATION NUMBER: 36,965

(C) REFERENCE/DOCKET NUMBER: UF15.C3

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (904) 375-8100 

(B) TELEFAX: (904) 372-5800 



(2) INFORMATION FOR SEQ ID NO:1:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4510 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 27..1518

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



ATTAATCTTT AATACTTTCA AAAGGT ATG AGA AAA TTG AAT TCT TTA TTT TCG 53
Met Arg Lys Leu Asn Ser Leu Phe Ser
1 5

CTC GCC GTC CTA TTA TCC CTA TTG TGT TGG GGA CAG ACG GCT GCC GCA 101
Leu Ala Val Leu Leu Ser Leu Leu Cys Trp Gly Gln Thr Ala Ala Ala
10 15 20 25

CAG GGA GGG CCG AAG ACT GCT CCT TCT GTG ACG CAC CAA GCG GTG CAG 149
Gln Gly Gly Pro Lys Thr Ala Pro Ser Val Thr His Gln Ala Val Gln
30 35 40

AAA GGT ATT CGA ACA TCC AAG GTT AAG GAT CTC CGA GAT CCG ATT CCT 197
Lys Gly Ile Arg Thr Ser Lys Val Lys Asp Leu Arg Asp Pro Ile Pro
45 50 55

GCC GGT ATG GCA CGA ATT ATC TTG GAG GCT CAC GAT GTA TGG GAA GAC 245
Ala Gly Met Ala Arg Ile Ile Leu Glu Ala His Asp Val Trp Glu Asp
60 65 70

GGC ACA GGC TAT CAA ATG CTT TGG GAT GCA GAT CAC AAT CAG TAC GGC 293
Gly Thr Gly Tyr Gln Met Leu Trp Asp Ala Asp His Asn Gln Tyr Gly
75 80 85

GCA TCC ATT CCC GAA GAA TCT TTT TGG TTT GCC AAC GGA ACG ATC CCG 341
Ala Ser Ile Pro Glu Glu Ser Phe Trp Phe Ala Asn Gly Thr Ile Pro
90 95 100 105

GCC GGT CTT TAC GAT CCT TTC GAG TAT AAA GTT CCG GTC AAT GCC GAT 389
Ala Gly Leu Tyr Asp Pro Phe Glu Tyr Lys Val Pro Val Asn Ala Asp
110 115 120

GCA TCT TTT TCT CCC ACG AAT TTC GTG CTT GAT GGA ACA GCA TCA GCC 437
Ala Ser Phe Ser Pro Thr Asn Phe Val Leu Asp Gly Thr Ala Ser Ala
125 130 135

GAT ATT CCT GCC GGC ACT TAT GAC TAT GTA ATC ATT AAC CCC AAT CCT 485
Asp Ile Pro Ala Gly Thr Tyr Asp Tyr Val Ile Ile Asn Pro Asn Pro
140 145 150
GGC ATA ATA TAT ATA GTA GGA GAG GGT GTC TCC AAA GGT AAC GAT TAT 533
Gly Ile Ile Tyr Ile Val Gly Glu Gly Val Ser Lys Gly Asn Asp Tyr
155 160 165

GTG GTA GAG GCC GGT AAG ACT TAT CAT TTC ACT GTC CAA CGA CAA GGC 581
Val Val Glu Ala Gly Lys Thr Tyr His Phe Thr Val Gln Arg Gln Gly
170 175 180 185

CCC GGC GAT GCT GCG TCC GTT GTA GTG ACC GGA GAA GGT GGC AAT GAA 629
Pro Gly Asp Ala Ala Ser Val Val Val Thr Gly Glu Gly Gly Asn Glu
190 195 200



TTC GCT CCC GTA CAG AAT CTC CAA TGG TCT GTA TCT GGG CAG ACA GTG 677
Phe Ala Pro Val Gln Asn Leu Gln Trp Ser Val Ser Gly Gln Thr Val
205 210 215

ACC CTC ACT TGG CAA GCC CCC GCA TCC GAC AAA CGG ACT TAT GTG TTG 725
Thr Leu Thr Trp Gln Ala Pro Ala Ser Asp Lys Arg Thr Tyr Val Leu
220 225 230

AAC GAA AGC TTC GAT ACG CAA ACG CTT CCT AAC GGC TGG ACA ATG ATC 773
Asn Glu Ser Phe Asp Thr Gln Thr Leu Pro Asn Gly Trp Thr Met Ile
235 240 245

GAT GCT GAT GGT GAT GGT CAC AAT TGG CTA TCT ACA ATA AAC GTT TAC 821
Asp Ala Asp Gly Asp Gly His Asn Trp Leu Ser Thr Ile Asn Val Tyr
250 255 260 265

AAC ACT GCT ACT CAT ACA GGT GAC GGT GCT ATG TTT AGC AAA TCA TGG 869
Asn Thr Ala Thr His Thr Gly Asp Gly Ala Met Phe Ser Lys Ser Trp
270 275 280

ACT GCT AGC GGT GGT GCA AAA ATT GAT TTG AGT CCT GAC AAC TAT TTG 917
Thr Ala Ser Gly Gly Ala Lys Ile Asp Leu Ser Pro Asp Asn Tyr Leu
285 290 295

GTA ACT CCA AAG GTT ACG GTT CCT GAG AAT GGT AAA CTT TCT TAT TGG 965
Val Thr Pro Lys Val Thr Val Pro Glu Asn Gly Lys Leu Ser Tyr Trp
300 305 310

GTT TCA TCT CAA GTG CCT TGG ACT AAT GAG CAT TAT GGA GTG TTC TTG 1013
Val Ser Ser Gln Val Pro Trp Thr Asn Glu His Tyr Gly Val Phe Leu
315 320 325

TCC ACA ACC GGA AAC GAG GCT GCA AAC TTT ACG ATA AAG CTA CTG GAA 1061
Ser Thr Thr Gly Asn Glu Ala Ala Asn Phe Thr Ile Lys Leu Leu Glu
330 335 340 345

GAA ACC CTC GGA TCC GAC AAA CCT GCT CCG ATG AAC TTG GTG AAG AGT 1109
Glu Thr Leu Gly Ser Asp Lys Pro Ala Pro Met Asn Leu Val Lys Ser
350 355 360

GAA GGA GTA AAG CTT CCT GCA CCT TAT CAG GAA AGA ACC ATC GAT CTC 1157
Glu Gly Val Lys Leu Pro Ala Pro Tyr Gln Glu Arg Thr Ile Asp Leu
365 370 375

TCT GCC TAT GCC GGA CAA CAG GTG TAC TTG GCA TTC CGT CAT TTC AAC 1205
Ser Ala Tyr Ala Gly Gln Gln Val Tyr Leu Ala Phe Arg His Phe Asn
380 385 390
TCT ACA GGT ATA TTC CGT CTT TAT CTT GAT GAT GTG GCT GTT TCT GGT 1253
Ser Thr Gly Ile Phe Arg Leu Tyr Leu Asp Asp Val Ala Val Ser Gly
395 400 405

GAA GGT TCT TCC AAC GAC TAC ACG TAG ACG GTA TAT CGT GAC AAT GTT 1301
Glu Gly Ser Ser Asn Asp Tyr Thr Tyr Thr Val Tyr Arg Asp Asn Val
410 415 420 425

GTT ATT GCC CAG AAT CTC GCG GCA ACG ACA TTC AAT CAG GAA AAT GTA 1349
Val Ile Ala Gln Asn Leu Ala Ala Thr Thr Phe Asn Gln Glu Asn Val
430 435 440

GCT CCC GGC CAG TAT AAC TAC TGT GTT GAA GTT AAG TAC ACA GCC GGC 1397
Ala Pro Gly Gln Tyr Asn Tyr Cys Val Glu Val Lys Tyr Thr Ala Gly
445 450 455

GTA TCT CCG AAG GTA TGT AAA GAC GTT ACG GTA GAA GGA TCC AAC GAA 1445
Val Ser Pro Lys Val Cys Lys Asp Val Thr Val Glu Gly Ser Asn Glu
460 465 470

TTT GCT CAT GTA CAG AAC CTG ACC GGT AGT GCA GTA GGT CAG AAA GTA 1493
Phe Ala His Val Gln Asn Leu Thr Gly Ser Ala Val Gly Gln Lys Val
475 480 485

ACG CTT AAG TGG GAT GCA CCT AAT G GTACCCCGAA TCCGAATCCC 1538
Thr Leu Lys Trp Asp Ala Pro Asn
490 495



GGAACAACAA CACTTTCCGA ATCATTCGAA AATGGTATTC CTGCCTCATG GAAGACGATC 1598

GATGCAGACG GTGACGGCAA CAATTGGACG ACGACCCCTC CTCCCGGAGG CACCTCTTTT 1658

GCAGGTCACA ACAGTGCAAT CTGTGCCTCT TCGGCTTCTT ATATCAACTT TGAAGGTCCT 1718

CAGAACCCTG ATAACTATCT GGTTACACCG GAGCTATCTC TTCCTAACGG AGGAACGCTT 1778

ACTTTCTGGG TATGTGCACA AGATGCCAAT TATGCATCAG AGCACTATGC CGTGTACGCA 1838

TCTTCTACGG GTAACGACGC TTCCAACTTC GCCAACGCTT TGTTGGAAGA AGTGCTGACG 1898

GCCAAGACAG TTGTTACGGC ACCTGAAGCC ATTCGTGGCA CTCGTGTTCA GGGCACCTGG 1958

TATCAAAAGA CGGTACAGTT GCCTGCGGGT ACTAAGTATG TTGCTTTCCG TCACTTCGGC 2018

TGTACGGACT TCTTCTGGAT TAACCTTGAT GATGTTGAGA TCAAGGCCAA CGGCAAGCGC 2078

GCAGACTTCA CGGAAACGTT CGAGTCTTCT ACTCATGGAG AGGCACCGGC GGAATGGACT 2138

ACTATCGATG CCGATGGCGA TGGTCAGGGT TGGCTCTGTC TGTCTTCCGG ACAATTGGAC 2198

TGGCTGACAG CTCATGGCGG CACCAACGTA GTAGCCTCTT TCTCATGGAA TGGAATGGCT 2258

TTGAATCCTG ATAACTATCT CATCTCAAAG GATGTTACAG GCGCAACTAA GGTAAAGTAC 2318

TACTATGCAG TCAACGACGG TTTTCCCGGG GATCACTATG CGGTGATGAT CTCCAAGACG 2378

GGCACGAACG CCGGAGACTT CACGGTTGTT TTCGAAGAAA CGCCTAACGG AATAAATAAG 2438

GGCGGAGCAA GATTCGGTCT TTCCACGGAA GCCGATGGCG CCAAACCTCA AAGTGTATGG 2498

ATCGAGCGTA CGGTAGATTT GCCTGCGGGT ACTAAGTATG TTGCTTTCCG TCACTACAAT 2558
TGCTCGGATT TGAACTACAT TCTTTTGGAT GATATTCAGT TCACCATGGG TGGCAGCCCC 2618

ACCCCGACCG ATTATACCTA CACGGTGTAT CGTGACGGTA CGAAGATCAA GGAAGGTCTG 2678

ACCGAAACGA CCTTCGAAGA AGACGGTGTA GCTACGGGCA ACCATGAGTA TTGCGTGGAA 2738

GTGAAGTACA CAGCCGGCGT ATCTCCGAAA GAGTGTGTAA ACGTAACTGT TGATCCTGTG 2798

CAGTTCAATC CTGTACAGAA CCTGACCGGT AGTGCAGTCG GCCAGAAAGT AACGCTTAAG 2858

TGGGATGCAC CTAATGGTAC CCCGAATCCA AATCCAAATC CGAATCCGGG AACAACAACA 2918

CTTTCCGAAT CATTCGAAAA TGGTATTCCT GCCTCATGGA AGACGATCGA TGCAGACGGT 2978

GACGGCAACA ATTGGACGAC GACCCCTCCT CCCGGAGGCA CCTCTTTTGC AGGTCACAAC 3038

AGTGCGATCT GTGCCTCTTC GGCTTCTTAT ATCAACTTTG AAGGCCCTCA GAACCCTGAT 3098

AACTATCTGG TTACACCGGA GCTATCTCTT CCTAACGGAG GAACGCTTAC TTTCTGGGTA 3158

TGTGCACAAG ATGCCAATTA TGCATCAGAG CACTATGCCG TGTATGCATC TTCTACGGGT 3218

AACGACGCTT CCAACTTCGC CAACGCTTTG TTGGAAGAAG TGCTGACGGC CAAGACAGTT 3278

GTTACGGCAC CTGAAGCCAT TCGTGGCACT CGTGTTCAGG GCACCTGGTA TCAAAAGACG 3338

GTACAGTTGC CTGCGGGTAC TAAGTATGTT GCTTTCCGTC ACTTCGGCTG TACGGACTTC 3398

TTCTGGATCA ACCTTGATGA TGTTGAGATC AAGGCCAACG GCAAGCGCGC AGACTTCACG 3458

GAAACGTTCG AGTCTTCTAC TCATGGAGAG GCACCGGCGG AATGGACTAC TATCGATGCC 3518

GATGGCGATG GTCAGGGTTG GCTCTGTCTG TCTTCCGGAC AATTGGGCTG GCTGACAGCT 3578

CATGGCGGCA CCAACGTAGT AGCCTCTTTC TCATGGAATG GAATGGCTTT GAATCCTGAT 3638

AACTATCTCA TCTCAAAGGA TGTTACAGGC GCAACTAAGG TAAAGTACTA CTATGCAGTC 3698

AACGACGGTT TTCCCGGGGA TCACTATGCG GTGATGTTCT CCAAGACGGG CACGAACGCC 3758

GGAGACTTCA CGGTTGTTTT CGAAGAAACG CCTAACGGAA TAAATAAGGG CGGAGCAAGA 3818

TTCGGTCTTT CCACGGAAGC CGATGGCGCC AAACCTCAAA GTGTATGGTT CGAGCGTACG 3878

GTAGATTTGC CTGCGGGTAC TAAGTATGTT GCTTTCCGTC ACTACAATTG CTCGGATTTG 3938

AACTACATTC TTTTGGATGA TATTCAGTTC ACCATGGGTG GCAGCCCCAC CCCGACCGAT 3998

TATACCTACA CGGTGTATCG TGACGGTACG AAGATCAAGG AAGGTCTGAC CGAAACGACC 4058

TTCGAAGAAG ACGGTGTAGC TACGGGCAAC CATGAGTATT GCGTGGAAGT GAAGTACACA 4118

GCCGGCGTAT CTCCGAAAGA GTGTGTAAAC GTAACTGTTG ATCCTGTGCA GTTCAATCCT 4178

GTACAGAACC TGACCGGTAG TGCAGTCGGC CAGAAAGTAA CGCTTAAGTG GGATGCACCT 4238

AATGGTACCC CGAATCCAAA TCCAAATCCG AATCCGGGAA CAACAACACT TTCCGAATCA 4298

TTCGAAAATG GTATTCCTGC CTCATGGAAG ACGATCGATG CAGACGGTGA CGGCAACAAT 4358

TGGACGACGA CCCCTCCTCC CGGAGGCACC TCTTTTGCAG GTCACAACAG TGCGATCTGT 4418

GTCTCTTCGG CTTCTTATAT CAACTTTGAA GGCCCTCAGA ACCCTGATAA CTATCTGGTT 4478

ACACCGGAGC TATCTCTTCC TGGCGGATTA AT 4510

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 497 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Arg Lys Leu Asn Ser Leu Phe Ser Leu Ala Val Leu Leu Ser Leu
1 5 10 15

Leu Cys Trp Gly Gln Thr Ala Ala Ala Gln Gly Gly Pro Lys Thr Ala
20 25 30

Pro Ser Val Thr His Gln Ala Val Gln Lys Gly Ile Arg Thr Ser Lys
35 40 45

Val Lys Asp Leu Arg Asp Pro Ile Pro Ala Gly Met Ala Arg Ile Ile
50 55 60

Leu Glu Ala His Asp Val Trp Glu Asp Gly Thr Gly Tyr Gln Met Leu
65 70 75 80

Trp Asp Ala Asp His Asn Gln Tyr Gly Ala Ser Ile Pro Glu Glu Ser
85 90 95

Phe Trp Phe Ala Asn Gly Thr Ile Pro Ala Gly Leu Tyr Asp Pro Phe
100 105 110

Glu Tyr Lys Val Pro Val Asn Ala Asp Ala Ser Phe Ser Pro Thr Asn
115 120 125

Phe Val Leu Asp Gly Thr Ala Ser Ala Asp Ile Pro Ala Gly Thr Tyr
130 135 140

Asp Tyr Val Ile Ile Asn Pro Asn Pro Gly Ile Ile Tyr Ile Val Gly
145 150 155 160

Glu Gly Val Ser Lys Gly Asn Asp Tyr Val Val Glu Ala Gly Lys Thr
165 170 175

Tyr His Phe Thr Val Gln Arg Gln Gly Pro Gly Asp Ala Ala Ser Val
180 185 190

Val Val Thr Gly Glu Gly Gly Asn Glu Phe Ala Pro Val Gln Asn Leu
195 200 205

Gln Trp Ser Val Ser Gly Gln Thr Val Thr Leu Thr Trp Gln Ala Pro
210 215 220

Ala Ser Asp Lys Arg Thr Tyr Val Leu Asn Glu Ser Phe Asp Thr Gln
225 230 235 240

Thr Leu Pro Asn Gly Trp Thr Met Ile Asp Ala Asp Gly Asp Gly His
245 250 255
245 250 255 
Asn Trp Leu Ser Thr Ile Asn Val Tyr Asn Thr Ala Thr His Thr Gly
260 265 270

Asp Gly Ala Met Phe Ser Lys Ser Trp Thr Ala Ser Gly Gly Ala Lys
275 280 285

Ile Asp Leu Ser Pro Asp Asn Tyr Leu Val Thr Pro Lys Val Thr Val
290 295 300

Pro Glu Asn Gly Lys Leu Ser Tyr Trp Val Ser Ser Gln Val Pro Trp
305 310 315 320

Thr Asn Glu His Tyr Gly Val Phe Leu Ser Thr Thr Gly Asn Glu Ala
325 330 335

Ala Asn Phe Thr Ile Lys Leu Leu Glu Glu Thr Leu Gly Ser Asp Lys
340 345 350

Pro Ala Pro Met Asn Leu Val Lys Ser Glu Gly Val Lys Leu Pro Ala
355 360 365

Pro Tyr Gln Glu Arg Thr Ile Asp Leu Ser Ala Tyr Ala Gly Gln Gln
370 375 380

Val Tyr Leu Ala Phe Arg His Phe Asn Ser Thr Gly Ile Phe Arg Leu
385 390 395 400

Tyr Leu Asp Asp Val Ala Val Ser Gly Glu Gly Ser Ser Asn Asp Tyr
405 410 415

Thr Tyr Thr Val Tyr Arg Asp Asn Val Val Ile Ala Gln Asn Leu Ala
420 425 430

Ala Thr Thr Phe Asn Gln Glu Asn Val Ala Pro Gly Gln Tyr Asn Tyr
435 440 445

Cys Val Glu Val Lys Tyr Thr Ala Gly Val Ser Pro Lys Val Cys Lys
450 455 460

Asp Val Thr Val Glu Gly Ser Asn Glu Phe Ala His Val Gln Asn Leu
465 470 475 480

Thr Gly Ser Ala Val Gly Gln Lys Val Thr Leu Lys Trp Asp Ala Pro
485 490 495

Asn
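
The correspondence between the coding region of SEQ ID NO:1 and the protein of SEQ ID NO:2 can be checked mechanically. The following is a minimal sketch, not part of the original listing: it assumes the standard genetic code, uses only the first row of SEQ ID NO:1 (nucleotides 1-53) as input, and the names `translate`, `CODON_TABLE`, and `FIRST_ROW` are illustrative.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str, cds_start: int) -> str:
    """Translate from the 1-based position cds_start.

    Whitespace in the listing row is ignored. Translation stops at the
    first stop codon or when fewer than three nucleotides remain.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for pos in range(cds_start - 1, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First row of SEQ ID NO:1; the CDS feature begins at nucleotide 27.
FIRST_ROW = "ATTAATCTTT AATACTTTCA AAAGGT ATG AGA AAA TTG AAT TCT TTA TTT TCG"

print(translate(FIRST_ROW, cds_start=27))  # MRKLNSLFS
```

The one-letter result MRKLNSLFS corresponds to Met Arg Lys Leu Asn Ser Leu Phe Ser, the first nine residues of SEQ ID NO:2.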



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1470 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Porphyromonas gingivalis 

(B) STRAIN: FDC381 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: genomic 

(B) CLONE: ST7 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 310..1359

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

GTTTCTTGCT CCCTGCACGA TGTAGGAAGC CGTTGTCACG TGACAATCAC TCCGTGCATG 60

ATGCAGGAAG CCGTTGTCAC GTGACAATCA CTCCGTGCAC GATGCAGGAA GCTGTCGTCA 120

CGTGACAATC ACGTCCTGCA CGATGCAGGA AACGATTGTC AGCCGACAAT CGTTTCGCGC 180

ACGGCTGTTT TGACCTTTCG TCGCCTGACA ATGCTTATAT AAAAGCTGTT TCAGGGGGCA 240

GTGTCACTTG ACACTGCTAC CAATAACAGA TTAATAATCA ATCAAATACA ACAAAAAAAG 300

GAAAAACAA ATG ACT GTA GAA AAT TTG CGT CTG CAG CGG CTC CAA AAT 348
Met Thr Val Glu Asn Leu Arg Leu Gln Arg Leu Gln Asn
1 5 10

TTG GAG CAC TAC CGT TTT GCC AAG AAT GTG CTG ACG CTC TGT CGC ACG 396
Leu Glu His Tyr Arg Phe Ala Lys Asn Val Leu Thr Leu Cys Arg Thr
15 20 25

GCA AAT ATC GCT AAA CTG AAT CCC AAA CTG CCC GAG CTG GAA AAG GCT 444
Ala Asn Ile Ala Lys Leu Asn Pro Lys Leu Pro Glu Leu Glu Lys Ala
30 35 40 45

ATC GAA ATG GAG GAT TTG GCT CTG AAT CCG CCC GTC GCG AAC GAG CTG 492
Ile Glu Met Glu Asp Leu Ala Leu Asn Pro Pro Val Ala Asn Glu Leu
50 55 60

ACG CCT CAG GTC ATA GCC CTC GAC GAG GAA CGC GAC AGA GCC TAT CAG 540
Thr Pro Gln Val Ile Ala Leu Asp Glu Glu Arg Asp Arg Ala Tyr Gln
65 70 75

GCG CTG ATG TCG CGC GTG CGT TCG TAT GCT TTC GAC GAG GAC AGC CAG 588
Ala Leu Met Ser Arg Val Arg Ser Tyr Ala Phe Asp Glu Asp Ser Gln
80 85 90

CTG CGC AAC GCG GCA GCC AGA ATC GAA GAC GTG GCC GCT CGC TAC GGC 636
Leu Arg Asn Ala Ala Ala Arg Ile Glu Asp Val Ala Ala Arg Tyr Gly
95 100 105

AAC GTG ATC CGA ATG AAC TAT GAC AAG GAG ACG GCC GCG ATA GAG AAT 684
Asn Val Ile Arg Met Asn Tyr Asp Lys Glu Thr Ala Ala Ile Glu Asn
110 115 120 125

TTC CTC ACC GAT CTC AAG GGC GAG AAC ATT CGC CCC CTC GTA ACG AAA 732
Phe Leu Thr Asp Leu Lys Gly Glu Asn Ile Arg Pro Leu Val Thr Lys
130 135 140
CTC GGC GTG ACG GCA CTC GTT GAC AGA CTG GAA AAG AAC AAT AAG GCC 780
Leu Gly Val Thr Ala Leu Val Asp Arg Leu Glu Lys Asn Asn Lys Ala
145 150 155

TTC GCC GAC TTC TTC CTC CGC CGT CTG AGC ACC GAC CAA CGA GGC AAA 828
Phe Ala Asp Phe Phe Leu Arg Arg Leu Ser Thr Asp Gln Arg Gly Lys
160 165 170

TAT GAC GTG AAG GCA CTC CGT GCC GAG ACC GAC CGC ACA TTG GTA GCC 876
Tyr Asp Val Lys Ala Leu Arg Ala Glu Thr Asp Arg Thr Leu Val Ala
175 180 185

GTG GTG CGC CGC ATG GAC TCC ATC GAC GAC ATG GAG CCG AGC CCG GAG 924
Val Val Arg Arg Met Asp Ser Ile Asp Asp Met Glu Pro Ser Pro Glu
190 195 200 205

ATC CGT GCG CTC ATC GAG CTC TAC AAC CGA CTC GTG GCC AAT CGC CGC 972
Ile Arg Ala Leu Ile Glu Leu Tyr Asn Arg Leu Val Ala Asn Arg Arg
210 215 220

GCT CTC TTG GCT CGT CGC GCC AGC TAC GGA GAA GCA GCC GTG GAG AAG 1020
Ala Leu Leu Ala Arg Arg Ala Ser Tyr Gly Glu Ala Ala Val Glu Lys
225 230 235

CGT CGT GCC GAG ATC GCC GAG ATG CTC CGC CCC CTG CTC GCC CGG ATC 1068
Arg Arg Ala Glu Ile Ala Glu Met Leu Arg Pro Leu Leu Ala Arg Ile
240 245 250

GTG GAG GAG AAG AAG ACG GCC GTC TTT GCC GGT CGC ACC CTC GGC ACG 1116
Val Glu Glu Lys Lys Thr Ala Val Phe Ala Gly Arg Thr Leu Gly Thr
255 260 265

GGC AAG AAC CGC CAC TAT CTC ATC ACA TTC GTA GCC GAG AAC GGC GAC 1164
Gly Lys Asn Arg His Tyr Leu Ile Thr Phe Val Ala Glu Asn Gly Asp
270 275 280 285

GAG GAG GAT CGC TGG TAC CGC ATC AAC GGG GAG CAA CTC GTC TAT GTG 1212
Glu Glu Asp Arg Trp Tyr Arg Ile Asn Gly Glu Gln Leu Val Tyr Val
290 295 300

CCC GAA GAC GAA CTC CCC AAG CCG AAG AAA AAG AAG AAA CCC GCA AGC 1260
Pro Glu Asp Glu Leu Pro Lys Pro Lys Lys Lys Lys Lys Pro Ala Ser
305 310 315

AGC ACG GAC ACT CCA TCC GAG CCG CCC GTC CTG CCG GAT CCA TCG CAA 1308
Ser Thr Asp Thr Pro Ser Glu Pro Pro Val Leu Pro Asp Pro Ser Gln
320 325 330

GGA GGC AGC AGT AGC GGC GGT GGC GAG CAA GGC TCT ACC GGC GGC GGA 1356
Gly Gly Ser Ser Ser Gly Gly Gly Glu Gln Gly Ser Thr Gly Gly Gly
335 340 345

CTC TGATCCCCCC GTGCCGTCCT GCCGGCCGCA GCAGCACAGG CAACCGAGTA 1409
Leu
350

TAAAAGACAA AGGGGCTGTG ACCAAATTCA TTTTTGGCAC AGCCCCTTGT ATATTCGAAA 1469

A 1470
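
The number printed at the right of each nucleotide row in this listing is the position of the last nucleotide in that row, so each row can be checked against the one before it. The following is a minimal sketch of such a check, not part of the original listing; `check_row` is an illustrative name, and the two rows used are the first two rows of SEQ ID NO:3.

```python
def check_row(row: str, previous_end: int) -> int:
    """Return the end position of a listing row.

    The last whitespace-separated field is the printed position; all
    earlier fields are nucleotide groups. Raises ValueError when the
    printed position disagrees with the running count.
    """
    *groups, printed = row.split()
    end = previous_end + sum(len(group) for group in groups)
    if end != int(printed):
        raise ValueError(f"row ends at {end}, but the listing prints {printed}")
    return end


rows = [
    "GTTTCTTGCT CCCTGCACGA TGTAGGAAGC CGTTGTCACG TGACAATCAC TCCGTGCATG 60",
    "ATGCAGGAAG CCGTTGTCAC GTGACAATCA CTCCGTGCAC GATGCAGGAA GCTGTCGTCA 120",
]

position = 0
for row in rows:
    position = check_row(row, position)
print(position)  # 120
```

The same check applies to translated rows, where each codon is a three-nucleotide group and a full row of sixteen codons advances the count by 48.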
(2) INFORMATION FOR SEQ ID NO: 4:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 350 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:



Met Thr Val Glu Asn Leu Arg Leu Gln Arg Leu Gln Asn Leu Glu His
1 5 10 15

Tyr Arg Phe Ala Lys Asn Val Leu Thr Leu Cys Arg Thr Ala Asn Ile
20 25 30

Ala Lys Leu Asn Pro Lys Leu Pro Glu Leu Glu Lys Ala Ile Glu Met
35 40 45

Glu Asp Leu Ala Leu Asn Pro Pro Val Ala Asn Glu Leu Thr Pro Gln
50 55 60

Val Ile Ala Leu Asp Glu Glu Arg Asp Arg Ala Tyr Gln Ala Leu Met
65 70 75 80

Ser Arg Val Arg Ser Tyr Ala Phe Asp Glu Asp Ser Gln Leu Arg Asn
85 90 95

Ala Ala Ala Arg Ile Glu Asp Val Ala Ala Arg Tyr Gly Asn Val Ile
100 105 110

Arg Met Asn Tyr Asp Lys Glu Thr Ala Ala Ile Glu Asn Phe Leu Thr
115 120 125

Asp Leu Lys Gly Glu Asn Ile Arg Pro Leu Val Thr Lys Leu Gly Val
130 135 140

Thr Ala Leu Val Asp Arg Leu Glu Lys Asn Asn Lys Ala Phe Ala Asp
145 150 155 160

Phe Phe Leu Arg Arg Leu Ser Thr Asp Gln Arg Gly Lys Tyr Asp Val
165 170 175

Lys Ala Leu Arg Ala Glu Thr Asp Arg Thr Leu Val Ala Val Val Arg
180 185 190

Arg Met Asp Ser Ile Asp Asp Met Glu Pro Ser Pro Glu Ile Arg Ala
195 200 205

Leu Ile Glu Leu Tyr Asn Arg Leu Val Ala Asn Arg Arg Ala Leu Leu
210 215 220

Ala Arg Arg Ala Ser Tyr Gly Glu Ala Ala Val Glu Lys Arg Arg Ala
225 230 235 240

Glu Ile Ala Glu Met Leu Arg Pro Leu Leu Ala Arg Ile Val Glu Glu
245 250 255

Lys Lys Thr Ala Val Phe Ala Gly Arg Thr Leu Gly Thr Gly Lys Asn
260 265 270
Arg His Tyr Leu Ile Thr Phe Val Ala Glu Asn Gly Asp Glu Glu Asp
275 280 285

Arg Trp Tyr Arg Ile Asn Gly Glu Gln Leu Val Tyr Val Pro Glu Asp
290 295 300

Glu Leu Pro Lys Pro Lys Lys Lys Lys Lys Pro Ala Ser Ser Thr Asp
305 310 315 320

Thr Pro Ser Glu Pro Pro Val Leu Pro Asp Pro Ser Gln Gly Gly Ser
325 330 335

Ser Ser Gly Gly Gly Glu Gln Gly Ser Thr Gly Gly Gly Leu
340 345 350



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1841 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 374..1424

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

AAGCTTGCAC CTACGACAAA AGATTTTTTC ATCTTACTAT ATTTTGGGAT TATATTTCTA 60

CACCTCCTTA TCCGGAATTT GGAAATGCGG GGCAAAAGTA GAAAAATTTT ATTTCCATCA 120

AAAAAAATCT TCAAATTTTT TTCACTTTGC GCATTCTGCA TATAAATGCT GCTACGTCGG 180

CAGATTATTC TGGTTAAAAA GTTATAGATG CAGCTCTTGG TTATAGTGTC CTAAGATCGC 240

TATGCAACCT GTAAGAAACG ATTGTAGGGT GTTTCTTGCT TCCTGCACGA ATGCAGGAGA 300

GCAGAAACGC CCGTTGCTGC TCCCGTCAAT ACACTAATTA TTATCGACTT AACCCCTTAA 360

TTCAAAAACT AAA ATG ACT GCA GAA ATT TTC TCG TTT TCC CGG CTC CAA 409
Met Thr Ala Glu Ile Phe Ser Phe Ser Arg Leu Gln
1 5 10

AAT TTG GAG CAC TAC CGT TTT GCC AAG AAT GTG CTG ACG CTC TGT CGC 457
Asn Leu Glu His Tyr Arg Phe Ala Lys Asn Val Leu Thr Leu Cys Arg
15 20 25

ACG GCA AAT ATC GCT AAA CTG AAT CCC AAA CTG CCC GAG CTG GAA AAG 505
Thr Ala Asn Ile Ala Lys Leu Asn Pro Lys Leu Pro Glu Leu Glu Lys
30 35 40

GCT ATC GAA ATG GAG GAT TTG GCT CTG AAT CCG CCC GTC GCG AAC GAG 553
Ala Ile Glu Met Glu Asp Leu Ala Leu Asn Pro Pro Val Ala Asn Glu
45 50 55 60
CTG ACG CCT CAG GTC ATA GCC CTC GAC GAG GAA CGC GAC AGA GCC TAT 601
Leu Thr Pro Gln Val Ile Ala Leu Asp Glu Glu Arg Asp Arg Ala Tyr
65 70 75

CAG GCG CTG ATG TCG CGC GTG CGT TCG TAT GCT TTC GAC GAG GAC AGC 649
Gln Ala Leu Met Ser Arg Val Arg Ser Tyr Ala Phe Asp Glu Asp Ser
80 85 90

CAG CTG CGC AAC GCG GCA GCC AGA ATC GAA GAC GTG GCC GCT CGC TAC 697
Gln Leu Arg Asn Ala Ala Ala Arg Ile Glu Asp Val Ala Ala Arg Tyr
95 100 105

GGC AAC GTG ATC CGA ATG AAC TAT GAC AAG GAG ACG GCC GCG ATA GAG 745
Gly Asn Val Ile Arg Met Asn Tyr Asp Lys Glu Thr Ala Ala Ile Glu
110 115 120

AAT TTC CTC ACC GAT CTC AAG GGC GAG AAC ATT CGC CCC CTC GTA ACG 793
Asn Phe Leu Thr Asp Leu Lys Gly Glu Asn Ile Arg Pro Leu Val Thr
125 130 135 140

AAA CTC GGC GTG ACG GCA CTC GTT GAC AGA CTG GAA AAG AAC AAT AAG 841
Lys Leu Gly Val Thr Ala Leu Val Asp Arg Leu Glu Lys Asn Asn Lys
145 150 155

GCC TTC GCC GAC TTC TTC CTC CGC CGT CTG AGC ACC GAC CAA CGA GGC 889
Ala Phe Ala Asp Phe Phe Leu Arg Arg Leu Ser Thr Asp Gln Arg Gly
160 165 170

AAA TAT GAC GTG AAG GCA CTC CGT GCC GAG ACC GAC CGC ACA TTG GTA 937
Lys Tyr Asp Val Lys Ala Leu Arg Ala Glu Thr Asp Arg Thr Leu Val
175 180 185

GCC GTG GTG CGC CGC ATG GAC TCC ATC GAC GAC ATG GAG CCG AGC CCG 985
Ala Val Val Arg Arg Met Asp Ser Ile Asp Asp Met Glu Pro Ser Pro
190 195 200

GAG ATC CGT GCG CTC ATC GAG CTC TAC AAC CGA CTC GTG GCC AAT CGC 1033
Glu Ile Arg Ala Leu Ile Glu Leu Tyr Asn Arg Leu Val Ala Asn Arg
205 210 215 220

CGC GCT CTC TTG GCT CGT CGC GCC AGC TAC GGA GAA GCA GCC GTG GAG 1081
Arg Ala Leu Leu Ala Arg Arg Ala Ser Tyr Gly Glu Ala Ala Val Glu
225 230 235

AAG CGT CGT GCC GAG ATC GCC GAG ATG CTC CGC CCC CTG CTC GCC CGG 1129
Lys Arg Arg Ala Glu Ile Ala Glu Met Leu Arg Pro Leu Leu Ala Arg
240 245 250

ATC GTG GAG GAG AAG AAG ACG GCC GTC TTT GCC GGT CGC ACC CTC GGC 1177
Ile Val Glu Glu Lys Lys Thr Ala Val Phe Ala Gly Arg Thr Leu Gly
255 260 265

ACG GGC AAG AAC CGC CAC TAT CTC ATC ACA TTC GTA GCC GAG AAC GGC 1225
Thr Gly Lys Asn Arg His Tyr Leu Ile Thr Phe Val Ala Glu Asn Gly
270 275 280

GAC GAG GAG GAT CGC TGG TAC CGC ATC AAC GGG GAG CAA CTC GTC TAT 1273
Asp Glu Glu Asp Arg Trp Tyr Arg Ile Asn Gly Glu Gln Leu Val Tyr
285 290 295 300
GTG CCC GAA GAC GAA CTC CCC AAG CCG AAG AAA AAG AAG AAA CCC GCA 1321
Val Pro Glu Asp Glu Leu Pro Lys Pro Lys Lys Lys Lys Lys Pro Ala
305 310 315

AGC AGC ACG GAC ACT CCA TCC GAG CCG CCC GTC CTG CCG GAT CCA TCG 1369
Ser Ser Thr Asp Thr Pro Ser Glu Pro Pro Val Leu Pro Asp Pro Ser
320 325 330

CAA GGA GGC AGC AGT AGC GGC GGT GGC GAG CAA GGC TCT ACC GGC GGC 1417
Gln Gly Gly Ser Ser Ser Gly Gly Gly Glu Gln Gly Ser Thr Gly Gly
335 340 345

GGA CTC T GATCCGCACT CCCCCGTGCC GTCCTGTCGG CCGCAGCAGC ACAGGCAACC 1474
Gly Leu
350

GAGTATAAAA GACAAAGGGG CTGTGACCAA ATTCATTTTT GGCACAGCCC CTTTCAGGTG 1534

CATAAGAATC TATATTACGG GAGAACAATC CCTGTAAGAG CAGTCACGAT GCCGTTTTCC 1594

TCATATACAG TAATCCGGAA GACGTCTTCC AGCAGATCGG GATGTCTCAG AACCCATGCT 1654

CCTTTTATGG GCTGGGGTTT TGGTTTGGCT CTGTAAATTT TTCCAAGGGA TCTAGTTTTT 1714

AGCTCTCAAT GGGCCAGATC CCCCCTCAAG TGCAATTCGA GAGAGGATAA AAGGGATAAT 1774

CCGTGAACGC TCTGCGGTCT ATCGGTAGCG TACGGTCATG AACAGGTGTG TACGTGCCTG 1834

TCCGCGG 1841

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 350 amino acids 
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Thr Ala Glu Ile Phe Ser Phe Ser Arg Leu Gln Asn Leu Glu His
1 5 10 15

Tyr Arg Phe Ala Lys Asn Val Leu Thr Leu Cys Arg Thr Ala Asn Ile
20 25 30

Ala Lys Leu Asn Pro Lys Leu Pro Glu Leu Glu Lys Ala Ile Glu Met
35 40 45

Glu Asp Leu Ala Leu Asn Pro Pro Val Ala Asn Glu Leu Thr Pro Gln
50 55 60

Val Ile Ala Leu Asp Glu Glu Arg Asp Arg Ala Tyr Gln Ala Leu Met
65 70 75 80

Ser Arg Val Arg Ser Tyr Ala Phe Asp Glu Asp Ser Gln Leu Arg Asn
85 90 95

Ala Ala Ala Arg Ile Glu Asp Val Ala Ala Arg Tyr Gly Asn Val Ile
100 105 110
Arg Met Asn Tyr Asp Lys Glu Thr Ala Ala Ile Glu Asn Phe Leu Thr
115 120 125

Asp Leu Lys Gly Glu Asn Ile Arg Pro Leu Val Thr Lys Leu Gly Val
130 135 140

Thr Ala Leu Val Asp Arg Leu Glu Lys Asn Asn Lys Ala Phe Ala Asp
145 150 155 160

Phe Phe Leu Arg Arg Leu Ser Thr Asp Gln Arg Gly Lys Tyr Asp Val
165 170 175

Lys Ala Leu Arg Ala Glu Thr Asp Arg Thr Leu Val Ala Val Val Arg
180 185 190

Arg Met Asp Ser Ile Asp Asp Met Glu Pro Ser Pro Glu Ile Arg Ala
195 200 205

Leu Ile Glu Leu Tyr Asn Arg Leu Val Ala Asn Arg Arg Ala Leu Leu
210 215 220

Ala Arg Arg Ala Ser Tyr Gly Glu Ala Ala Val Glu Lys Arg Arg Ala
225 230 235 240

Glu Ile Ala Glu Met Leu Arg Pro Leu Leu Ala Arg Ile Val Glu Glu
245 250 255

Lys Lys Thr Ala Val Phe Ala Gly Arg Thr Leu Gly Thr Gly Lys Asn
260 265 270

Arg His Tyr Leu Ile Thr Phe Val Ala Glu Asn Gly Asp Glu Glu Asp
275 280 285

Arg Trp Tyr Arg Ile Asn Gly Glu Gln Leu Val Tyr Val Pro Glu Asp
290 295 300

Glu Leu Pro Lys Pro Lys Lys Lys Lys Lys Pro Ala Ser Ser Thr Asp
305 310 315 320

Thr Pro Ser Glu Pro Pro Val Leu Pro Asp Pro Ser Gln Gly Gly Ser
325 330 335

Ser Ser Gly Gly Gly Glu Gln Gly Ser Thr Gly Gly Gly Leu
340 345 350
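
SEQ ID NO:4 and SEQ ID NO:6 are both 350 residues long, and their first rows can be compared position by position. The following is a minimal sketch of such a comparison, not part of the original listing; it covers only residues 1-16 as printed above, and `differing_positions` is an illustrative name.

```python
def differing_positions(first, second):
    """Return the 1-based positions at which two equal-length residue lists differ."""
    if len(first) != len(second):
        raise ValueError("sequences must have the same length")
    return [
        index
        for index, (a, b) in enumerate(zip(first, second), start=1)
        if a != b
    ]


# Residues 1-16 of SEQ ID NO:4 and SEQ ID NO:6, in three-letter code.
SEQ4_NTERM = "Met Thr Val Glu Asn Leu Arg Leu Gln Arg Leu Gln Asn Leu Glu His".split()
SEQ6_NTERM = "Met Thr Ala Glu Ile Phe Ser Phe Ser Arg Leu Gln Asn Leu Glu His".split()

print(differing_positions(SEQ4_NTERM, SEQ6_NTERM))  # [3, 5, 6, 7, 8, 9]
```

Within these sixteen residues the two sequences differ at six positions; the sketch makes no statement about the remainder of either sequence.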



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4080 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 87.. 3347 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

TCAAGAATCA GGCCTTCTTA ATAACCAATT CAGGCCTTCC TCCGGGTTCT TAC CGTAAAC 60 

TAATTTACTA AAAGTTGGAG TTTTGT ATG GGA ACA GTT GTT GCT GAT CCC ACC 113 

Met Gly Thr Val Val Ala Asp Pro Thr 
1 5 

GTT GCT GCG CCT GTG AAA ATG GCT AAA CAG ATA GCC GAA AAT GGT AAT 161 
Val Ala Ala Pro Val Lys Met Ala Lys Gin He Ala Glu Asn Gly Asn 
10 15 20 25 

TAT GAT GTA GTG ATG ACT CGC TCT AAC TAT CTT CCT GTG ATC AAC CAA 209 
Tyr Asp Val Val Met Thr Arg Ser Asn Tyr Leu Pro Val He Asn Gin 
30 35 40 

ATT CAG GCA GGA GAG CCT AGC CCC TAC CAG CCT GTT AAC AAC TTG ACT 257 
lie Gin Ala Gly Glu Pro Ser Pro Tyr Gin Pro Val Asn Asn Leu Thr 
45 50 55 

GCT CCA CCG GAG GGT GAG GAA GTG GCG CTC AAG TGG GAT ACC CCG AGC 305 
Ala Pro Pro Glu Gly Glu Glu Val Ala Leu Lys Trp Asp Thr Pro Ser 
60 65 70 

GCA AAG AAG GCA GAA GCT TCC CGT GAA GTA AAA CGG ATC GGA GAC GGT 353 
Ala Lys Lys Ala Glu Ala Ser Arg Glu Val Lys Arg He Gly Asp Gly 
75 80 85 

CTT TTC GTT ACG ATC GAA CCT GCA AAC GAT GTA CGT GCC AAC GAA GCC 401 
Leu Phe Val Thr Ile Glu Pro Ala Asn Asp Val Arg Ala Asn Glu Ala
90 95 100 105 

AAG GTT GTG CTC GCA GCA GAC AAC GTA TGG GGA GAC AAT ACG GGT TAC 449 
Lys Val Val Leu Ala Ala Asp Asn Val Trp Gly Asp Asn Thr Gly Tyr 
110 115 120 

CAG TTC TTG TTG GAT GCC GAT CAC AAT ACA TTC GGA AGT GTC ATT CCG 497 
Gln Phe Leu Leu Asp Ala Asp His Asn Thr Phe Gly Ser Val Ile Pro
125 130 135 

GCA ACC GGT CCT CTC TTT ACC GGA ACA GCT TCT TCC AAT CTT TAC AGT 545 
Ala Thr Gly Pro Leu Phe Thr Gly Thr Ala Ser Ser Asn Leu Tyr Ser 
140 145 150 

GCG AAC TTC GAG TAT TTG ATC CCG GCC AAT GCC GAT CCT GTT GTT ACT 593 
Ala Asn Phe Glu Tyr Leu Ile Pro Ala Asn Ala Asp Pro Val Val Thr
155 160 165 

ACA CAG AAT ATT ATC GTT ACA GGA CAG GGT GAA GTT GTA ATC CCC GGT 641 
Thr Gln Asn Ile Ile Val Thr Gly Gln Gly Glu Val Val Ile Pro Gly
170 175 180 185 

GGT GTT TAC GAC TAT TGC ATT ACG AAC CCG GAA CCT GCA TCC GGA AAG 689 
Gly Val Tyr Asp Tyr Cys Ile Thr Asn Pro Glu Pro Ala Ser Gly Lys
190 195 200 

ATG TGG ATC GCA GGA GAT GGA GAC AAC CAG CCT GCA CGT TAT GAC GAT 737
Met Trp Ile Ala Gly Asp Gly Asp Asn Gln Pro Ala Arg Tyr Asp Asp
205 210 215 



TTC ACA TTC GAA GCA GGC AAG AAG TAC ACC TTC ACG ATG CGT CGC GCC 785 
Phe Thr Phe Glu Ala Gly Lys Lys Tyr Thr Phe Thr Met Arg Arg Ala
220 225 230 

GGA ATG GGA GAT GGA ACT GAT ATG GAA GTC GAA GAC GAT TCA CCT GCA 833 
Gly Met Gly Asp Gly Thr Asp Met Glu Val Glu Asp Asp Ser Pro Ala 
235 240 245 

AGC TAT ACC TAT ACA GTC TAT CGT GAC GGC ACG AAG ATC AAG GAA GGT 881
Ser Tyr Thr Tyr Thr Val Tyr Arg Asp Gly Thr Lys Ile Lys Glu Gly
250 255 260 265

CTG ACG GCT ACG ACA TTC GAA GAA GAC GGT GTA GCT GCA GGC AAT CAT 929 
Leu Thr Ala Thr Thr Phe Glu Glu Asp Gly Val Ala Ala Gly Asn His 
270 275 280 

GAG TAT TGC GTG GAA GTT AAG TAC ACA GCC GGC GTA TCT CCG AAG GTA 977 
Glu Tyr Cys Val Glu Val Lys Tyr Thr Ala Gly Val Ser Pro Lys Val 
285 290 295 

TGT AAA GAC GTT ACG GTA GAA GGA TCC AAT GAA TTT GCT CCT GTA CAG 1025
Cys Lys Asp Val Thr Val Glu Gly Ser Asn Glu Phe Ala Pro Val Gln
300 305 310 

AAC CTG ACC GGT AGT GCA GTC GGC CAG AAA GTA ACG CTT AAG TGG GAT 1073 
Asn Leu Thr Gly Ser Ala Val Gly Gln Lys Val Thr Leu Lys Trp Asp
315 320 325 

GCA CCT AAT GGT ACC CCA AAT CCG AAT CCG AAT CCG AAT CCG GGA ACA 1121 
Ala Pro Asn Gly Thr Pro Asn Pro Asn Pro Asn Pro Asn Pro Gly Thr
330 335 340 345 

ACA ACA CTT TCC GAA TCA TTC GAA AAT GGT ATT CCT GCC TCA TGG AAG 1169 
Thr Thr Leu Ser Glu Ser Phe Glu Asn Gly Ile Pro Ala Ser Trp Lys
350 355 360 

ACG ATC GAT GCA GAC GGT GAC GGG CAT GGC TGG AAA CCT GGA AAT GCT 1217 
Thr Ile Asp Ala Asp Gly Asp Gly His Gly Trp Lys Pro Gly Asn Ala
365 370 375 

CCC GGA ATC GCT GGC TAC AAT AGC AAT GGT TGT GTA TAT TCA GAG TCA 1265 
Pro Gly Ile Ala Gly Tyr Asn Ser Asn Gly Cys Val Tyr Ser Glu Ser
380 385 390 

TTC GGT CTT GGT GGT ATA GGA GTT CTT ACC CCT GAC AAC TAT CTG ATA 1313
Phe Gly Leu Gly Gly Ile Gly Val Leu Thr Pro Asp Asn Tyr Leu Ile
395 400 405

ACA CCG GCA TTG GAT TTG GCT AAC GGA GGT AAG TTG ACT TTC TGG GTA 1361
Thr Pro Ala Leu Asp Leu Ala Asn Gly Gly Lys Leu Thr Phe Trp Val
410 415 420 425

TGC GCA CAG GAT GCT AAT TAT GCA TCC GAG CAC TAT GCG GTG TAT GCA 1409
Cys Ala Gln Asp Ala Asn Tyr Ala Ser Glu His Tyr Ala Val Tyr Ala
430 435 440

TCT TCG ACC GGT AAC GAT GCA TCC AAC TTC ACG AAT GCT TTG TTG GAA 1457
Ser Ser Thr Gly Asn Asp Ala Ser Asn Phe Thr Asn Ala Leu Leu Glu
445 450 455

GAG ACG ATT ACG GCA AAA GGT GTT CGC TCG CCG GAA GCT ATT CGT GGT 1505
Glu Thr Ile Thr Ala Lys Gly Val Arg Ser Pro Glu Ala Ile Arg Gly
460 465 470

CGT ATA CAG GGT ACT TGG CGC CAG AAG ACG GTA GAC CTT CCC GCA GGT 1553
Arg Ile Gln Gly Thr Trp Arg Gln Lys Thr Val Asp Leu Pro Ala Gly
475 480 485 

ACG AAA TAT GTT GCT TTC CGT CAC TTC CAA AGC ACG GAT ATG TTC TAC 1601 
Thr Lys Tyr Val Ala Phe Arg His Phe Gln Ser Thr Asp Met Phe Tyr
490 495 500 505 

ATC GAC CTT GAT GAG GTT GAG ATC AAG GCC AAT GGC AAG CGC GCA GAC 1649 
Ile Asp Leu Asp Glu Val Glu Ile Lys Ala Asn Gly Lys Arg Ala Asp
510 515 520 

TTC ACG GAA ACG TTC GAG TCT TCT ACT CAT GGA GAG GCA CCA GCG GAA 1697 
Phe Thr Glu Thr Phe Glu Ser Ser Thr His Gly Glu Ala Pro Ala Glu 
525 530 535 

TGG ACT ACT ATC GAT GCC GAT GGC GAT GGT CAG GAT TGG CTC TGT CTG 1745 
Trp Thr Thr Ile Asp Ala Asp Gly Asp Gly Gln Asp Trp Leu Cys Leu
540 545 550 

TCT TCC GGA CAA TTG GAC TGG CTG ACA GCT CAT GGC GGC ACC AAC GTA 1793
Ser Ser Gly Gln Leu Asp Trp Leu Thr Ala His Gly Gly Thr Asn Val
555 560 565 

GTA GCC TCT TTC TCA TGG AAT GGA ATG GCT TTG AAT CCT GAT AAC TAT 1841 
Val Ala Ser Phe Ser Trp Asn Gly Met Ala Leu Asn Pro Asp Asn Tyr 
570 575 580 585 

CTC ATC TCA AAG GAT GTT ACA GGC GCA ACG AAG GTA AAG TAC TAC TAT 1889 
Leu Ile Ser Lys Asp Val Thr Gly Ala Thr Lys Val Lys Tyr Tyr Tyr
590 595 600 

GCA GTC AAC GAC GGT TTT CCC GGG GAT CAC TAT GCG GTG ATG ATC TCC 1937 
Ala Val Asn Asp Gly Phe Pro Gly Asp His Tyr Ala Val Met Ile Ser
605 610 615 

AAG ACG GGC ACG AAC GCC GGA GAC TTC ACG GTT GTT TTC GAA GAA ACG 1985
Lys Thr Gly Thr Asn Ala Gly Asp Phe Thr Val Val Phe Glu Glu Thr
620 625 630 

CCT AAC GGA ATA AAT AAG GGC GGA GCA AGA TTC GGT CTT TCC ACG GAA 2033 
Pro Asn Gly Ile Asn Lys Gly Gly Ala Arg Phe Gly Leu Ser Thr Glu
635 640 645 

GCC AAT GGC GCC AAA CCT CAA AGT GTA TGG ATC GAG CGT ACG GTA GAT 2081 
Ala Asn Gly Ala Lys Pro Gln Ser Val Trp Ile Glu Arg Thr Val Asp
650 655 660 665 

TTG CCT GCG GGC ACG AAG TAT GTT GCT TTC CGT CAC TAC AAT TGC TCG 2129 
Leu Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg His Tyr Asn Cys Ser 
670 675 680 

GAT TTG GAC TAC ATT CTT TTG GAT GAT ATT CAG TTC ACC ATG GGT GGC 2177 
Asp Leu Asp Tyr Ile Leu Leu Asp Asp Ile Gln Phe Thr Met Gly Gly
685 690 695 

AGC CCC ACC CCG ACC GAT TAT ACC TAC ACG GTA TAT CGT GAT GGT ACG 2225 
Ser Pro Thr Pro Thr Asp Tyr Thr Tyr Thr Val Tyr Arg Asp Gly Thr 
700 705 710 

AAG ATC AAG GAA GGT CTG ACC GAA ACG ACC TTC GAA GAA GAC GGC GTA 2273
Lys Ile Lys Glu Gly Leu Thr Glu Thr Thr Phe Glu Glu Asp Gly Val
715 720 725

GCT ACG GGC AAT CAT GAG TAT TGC GTG GAA GTG AAG TAC ACA GCC GGC 2321
Ala Thr Gly Asn His Glu Tyr Cys Val Glu Val Lys Tyr Thr Ala Gly 
730 735 740 745 

GTA TCT CCG AAG GTG TGT GTA AAC GTA ACT ATT AAT CCG ACT CAG TTC 2369 
Val Ser Pro Lys Val Cys Val Asn Val Thr Ile Asn Pro Thr Gln Phe
750 755 760 

AAT CCT GTA AAG AAC CTG AAG GCA CAA CCG GAT GGC GGC GAC GTG GTT 2417 
Asn Pro Val Lys Asn Leu Lys Ala Gln Pro Asp Gly Gly Asp Val Val
765 770 775 

CTC AAG TGG GAA GCC CCG AGT GGC AAA CGA GGA GAA CTG CTT AAT GAA 2465 
Leu Lys Trp Glu Ala Pro Ser Gly Lys Arg Gly Glu Leu Leu Asn Glu 
780 785 790 

GAT TTT GAA GGA GAC GCT ATT CCC ACA GGG TGG ACA GCA TTG GAT GCC 2513 
Asp Phe Glu Gly Asp Ala Ile Pro Thr Gly Trp Thr Ala Leu Asp Ala
795 800 805 

GAT GGT GAC GGT AAT AAC TGG GAT ATC ACG CTC AAT GAA TTT ACG CGA 2561 
Asp Gly Asp Gly Asn Asn Trp Asp Ile Thr Leu Asn Glu Phe Thr Arg
810 815 820 825

GGA GAG CGT CAT GTT CTT TCA CCT TTA CGC GCC AGC AAC GTA GCC ATA 2609
Gly Glu Arg His Val Leu Ser Pro Leu Arg Ala Ser Asn Val Ala Ile
830 835 840 

TCC TAT TCT TCT TTA CTT CAG GGT CAA GAA TAT TTG CCT CTC ACG CCG 2657 
Ser Tyr Ser Ser Leu Leu Gln Gly Gln Glu Tyr Leu Pro Leu Thr Pro
845 850 855 

AAC AAC TTT CTG ATC ACT CCG AAG GTT GAA GGA GCA AAG AAG ATT ACT 2705 
Asn Asn Phe Leu Ile Thr Pro Lys Val Glu Gly Ala Lys Lys Ile Thr
860 865 870 

TAT AAG GTG GGT TCA CCG GGT CTT CCT CAA TGG AGT CAT GAT CAT TAT 2753 
Tyr Lys Val Gly Ser Pro Gly Leu Pro Gln Trp Ser His Asp His Tyr
875 880 885

GCA CTC TGT ATC TCC AAG AGC GGA ACG GCT GCA GCC GAC TTC GAA GTA 2801 
Ala Leu Cys Ile Ser Lys Ser Gly Thr Ala Ala Ala Asp Phe Glu Val
890 895 900 905

ATC TTT GAA GAA ACG ATG ACC TAC ACT CAA GGA GGA GCC AAC TTG ACA 2849 
Ile Phe Glu Glu Thr Met Thr Tyr Thr Gln Gly Gly Ala Asn Leu Thr
910 915 920 

AGA GAA AAA GAC CTC CCT GCC GGC ACG AAA TAT GTC GCT TTC CGT CAT 2897
Arg Glu Lys Asp Leu Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg His
925 930 935 

TAC AAT TGC ACG GAT GTT CTG GGC ATA ATG ATT GAC GAT GTA GTG ATA 2945 
Tyr Asn Cys Thr Asp Val Leu Gly Ile Met Ile Asp Asp Val Val Ile
940 945 950 

ACA GGT GAA GGC GAA GGT CCC AGT TAC ACC TAC ACG GTG TAT CGT GAC 2993 
Thr Gly Glu Gly Glu Gly Pro Ser Tyr Thr Tyr Thr Val Tyr Arg Asp 
955 960 965 

GGC ACG AAG ATC CAG GAA GGT CTG ACC GAA ACG ACC TAC CGC GAT GCA 3041 
Gly Thr Lys Ile Gln Glu Gly Leu Thr Glu Thr Thr Tyr Arg Asp Ala
970 975 980 985

GGA ATG AGT GCA CAA TCT CAT GAG TAT TGC GTA GAG GTT AAG TAC GCA 3089
Gly Met Ser Ala Gln Ser His Glu Tyr Cys Val Glu Val Lys Tyr Ala
990 995 1000 

GCC GGC GTA TCT CCG AAG GTT TGT GTG GAT TAT ATT CCT GAT GGA GTG 3137 
Ala Gly Val Ser Pro Lys Val Cys Val Asp Tyr Ile Pro Asp Gly Val
1005 1010 1015 

GCA GAC GTA ACT GCT CAG AAG CCT TAC ACG CTG ACG GTT GTA GGA AAG 3185 
Ala Asp Val Thr Ala Gln Lys Pro Tyr Thr Leu Thr Val Val Gly Lys
1020 1025 1030 

ACT ATC ACG GTA ACT TGC CAA GGC GAA GCT ATG ATC TAC GAC ATG AAC 3233 
Thr Ile Thr Val Thr Cys Gln Gly Glu Ala Met Ile Tyr Asp Met Asn
1035 1040 1045 

GGT CGT CGT CTG GCA GCG GGT CGC AAC ACG GTT GTT TAC ACG GCT CAG 3281 
Gly Arg Arg Leu Ala Ala Gly Arg Asn Thr Val Val Tyr Thr Ala Gln
1050 1055 1060 1065 

GGC GGC TAC TAT GCA GTC ATG GTT GTC GTT GAC GGC AAG TCT TAC GTA 3329 
Gly Gly Tyr Tyr Ala Val Met Val Val Val Asp Gly Lys Ser Tyr Val 
1070 1075 1080 

GAG AAA CTC GCT ATC AAG TAATTCTGTC TTGGACTCGG AGACTTTGTG 3377 
Glu Lys Leu Ala Ile Lys
1085 



CAGACACTTT TAATATAGGT CTGTAATTGT CTCAGAGTAT GAATCGGTCG CCCGACTTCC 3437

TTAAAAGGAG GTCGGGCGAC TTCGTTTTTA TTATTGCTGT CTGGTAAACT TGTCAAGAGG 3497

AGACCTTTGA AAAATGGGGC GGTCAATAAT TTTCGGTCTA TGGGTCAAAT TGCAGGCTAC 3557

TGTTTTAGGT GTATGTTGGG CTATCTTCCT ATCTTTAAGA GACCTTTGAA AAATAAGGAG 3617

ATGGAGGGAA GAGGAGTTCT TGGCATAAAA GGAGCGAGTG AAAGGGGTGG CAGTAAGGAG 3677

TGAAAGTAGT TGTAAATCCC CCCTTTGAGG AGCTACTTGT ACGAGCTCCT CAAGGGTGGT 3737

TATGCCTTAT CCTACGGATG AGGACATAAT TATCCCCGGC GTTCTGTATA AATTAAAGGC 3797

GATGCTTTCA AGAATGTTTT GAGTATGGGT CTTGGCAAGT CCCCGGTATC GACATGTCCG 3857

CCATGAAACC ACCGGCGAAT ACTGCCAAAG GTGCGTTCGA TGGTGCTCCG TATCGGACTG 3917

ATTGCTTTGT TTCGTTGCTT CTCTTCCTCG GTCAATGCCC TGTTGCGTTG TGCCTTGTGC 3977

ATAATGCCGT CTTGAAGGTG ATGGGTTTGC AGGTAGGAAC GATTTTCCCC GCAAGCATAT 4037

CCTTTGTCCG CCAAGACGGC TGTACCTTGA GGTATGTTTG CAC 4080



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1087 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Gly Thr Val Val Ala Asp Pro Thr Val Ala Ala Pro Val Lys Met
1 5 10 15

Ala Lys Gln Ile Ala Glu Asn Gly Asn Tyr Asp Val Val Met Thr Arg
20 25 30 

Ser Asn Tyr Leu Pro Val Ile Asn Gln Ile Gln Ala Gly Glu Pro Ser
35 40 45 

Pro Tyr Gln Pro Val Asn Asn Leu Thr Ala Pro Pro Glu Gly Glu Glu
50 55 60 

Val Ala Leu Lys Trp Asp Thr Pro Ser Ala Lys Lys Ala Glu Ala Ser 
65 70 75 80 

Arg Glu Val Lys Arg Ile Gly Asp Gly Leu Phe Val Thr Ile Glu Pro
85 90 95 

Ala Asn Asp Val Arg Ala Asn Glu Ala Lys Val Val Leu Ala Ala Asp 
100 105 110 

Asn Val Trp Gly Asp Asn Thr Gly Tyr Gln Phe Leu Leu Asp Ala Asp
115 120 125 

His Asn Thr Phe Gly Ser Val Ile Pro Ala Thr Gly Pro Leu Phe Thr
130 135 140 

Gly Thr Ala Ser Ser Asn Leu Tyr Ser Ala Asn Phe Glu Tyr Leu Ile
145 150 155 160 

Pro Ala Asn Ala Asp Pro Val Val Thr Thr Gln Asn Ile Ile Val Thr
165 170 175 

Gly Gln Gly Glu Val Val Ile Pro Gly Gly Val Tyr Asp Tyr Cys Ile
180 185 190 

Thr Asn Pro Glu Pro Ala Ser Gly Lys Met Trp Ile Ala Gly Asp Gly
195 200 205 

Asp Asn Gln Pro Ala Arg Tyr Asp Asp Phe Thr Phe Glu Ala Gly Lys
210 215 220 

Lys Tyr Thr Phe Thr Met Arg Arg Ala Gly Met Gly Asp Gly Thr Asp 
225 230 235 240 

Met Glu Val Glu Asp Asp Ser Pro Ala Ser Tyr Thr Tyr Thr Val Tyr 
245 250 255 

Arg Asp Gly Thr Lys Ile Lys Glu Gly Leu Thr Ala Thr Thr Phe Glu
260 265 270 

Glu Asp Gly Val Ala Ala Gly Asn His Glu Tyr Cys Val Glu Val Lys 
275 280 285 

Tyr Thr Ala Gly Val Ser Pro Lys Val Cys Lys Asp Val Thr Val Glu 
290 295 300 

Gly Ser Asn Glu Phe Ala Pro Val Gln Asn Leu Thr Gly Ser Ala Val
305 310 315 320 

Gly Gln Lys Val Thr Leu Lys Trp Asp Ala Pro Asn Gly Thr Pro Asn
325 330 335

Pro Asn Pro Asn Pro Asn Pro Gly Thr Thr Thr Leu Ser Glu Ser Phe
340 345 350 

Glu Asn Gly Ile Pro Ala Ser Trp Lys Thr Ile Asp Ala Asp Gly Asp
355 360 365 

Gly His Gly Trp Lys Pro Gly Asn Ala Pro Gly Ile Ala Gly Tyr Asn
370 375 380 

Ser Asn Gly Cys Val Tyr Ser Glu Ser Phe Gly Leu Gly Gly Ile Gly
385 390 395 400 

Val Leu Thr Pro Asp Asn Tyr Leu Ile Thr Pro Ala Leu Asp Leu Ala
405 410 415 

Asn Gly Gly Lys Leu Thr Phe Trp Val Cys Ala Gln Asp Ala Asn Tyr
420 425 430 

Ala Ser Glu His Tyr Ala Val Tyr Ala Ser Ser Thr Gly Asn Asp Ala 
435 440 445 

Ser Asn Phe Thr Asn Ala Leu Leu Glu Glu Thr Ile Thr Ala Lys Gly
450 455 460 

Val Arg Ser Pro Glu Ala Ile Arg Gly Arg Ile Gln Gly Thr Trp Arg
465 470 475 480

Gln Lys Thr Val Asp Leu Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg
485 490 495 

His Phe Gln Ser Thr Asp Met Phe Tyr Ile Asp Leu Asp Glu Val Glu
500 505 510 

Ile Lys Ala Asn Gly Lys Arg Ala Asp Phe Thr Glu Thr Phe Glu Ser
515 520 525 

Ser Thr His Gly Glu Ala Pro Ala Glu Trp Thr Thr Ile Asp Ala Asp
530 535 540 

Gly Asp Gly Gln Asp Trp Leu Cys Leu Ser Ser Gly Gln Leu Asp Trp
545 550 555 560 

Leu Thr Ala His Gly Gly Thr Asn Val Val Ala Ser Phe Ser Trp Asn 
565 570 575 

Gly Met Ala Leu Asn Pro Asp Asn Tyr Leu Ile Ser Lys Asp Val Thr
580 585 590 

Gly Ala Thr Lys Val Lys Tyr Tyr Tyr Ala Val Asn Asp Gly Phe Pro 
595 600 605 

Gly Asp His Tyr Ala Val Met Ile Ser Lys Thr Gly Thr Asn Ala Gly
610 615 620 

Asp Phe Thr Val Val Phe Glu Glu Thr Pro Asn Gly Ile Asn Lys Gly
625 630 635 640 

Gly Ala Arg Phe Gly Leu Ser Thr Glu Ala Asn Gly Ala Lys Pro Gln
645 650 655 

Ser Val Trp Ile Glu Arg Thr Val Asp Leu Pro Ala Gly Thr Lys Tyr
660 665 670

Val Ala Phe Arg His Tyr Asn Cys Ser Asp Leu Asp Tyr Ile Leu Leu
675 680 685 

Asp Asp Ile Gln Phe Thr Met Gly Gly Ser Pro Thr Pro Thr Asp Tyr
690 695 700 

Thr Tyr Thr Val Tyr Arg Asp Gly Thr Lys Ile Lys Glu Gly Leu Thr
705 710 715 720 

Glu Thr Thr Phe Glu Glu Asp Gly Val Ala Thr Gly Asn His Glu Tyr 
725 730 735 

Cys Val Glu Val Lys Tyr Thr Ala Gly Val Ser Pro Lys Val Cys Val 
740 745 750 

Asn Val Thr Ile Asn Pro Thr Gln Phe Asn Pro Val Lys Asn Leu Lys
755 760 765 

Ala Gln Pro Asp Gly Gly Asp Val Val Leu Lys Trp Glu Ala Pro Ser
770 775 780 

Gly Lys Arg Gly Glu Leu Leu Asn Glu Asp Phe Glu Gly Asp Ala Ile
785 790 795 800 

Pro Thr Gly Trp Thr Ala Leu Asp Ala Asp Gly Asp Gly Asn Asn Trp 
805 810 815 

Asp Ile Thr Leu Asn Glu Phe Thr Arg Gly Glu Arg His Val Leu Ser
820 825 830 

Pro Leu Arg Ala Ser Asn Val Ala Ile Ser Tyr Ser Ser Leu Leu Gln
835 840 845 

Gly Gln Glu Tyr Leu Pro Leu Thr Pro Asn Asn Phe Leu Ile Thr Pro
850 855 860 

Lys Val Glu Gly Ala Lys Lys Ile Thr Tyr Lys Val Gly Ser Pro Gly
865 870 875 880 

Leu Pro Gln Trp Ser His Asp His Tyr Ala Leu Cys Ile Ser Lys Ser
885 890 895 

Gly Thr Ala Ala Ala Asp Phe Glu Val Ile Phe Glu Glu Thr Met Thr
900 905 910 

Tyr Thr Gln Gly Gly Ala Asn Leu Thr Arg Glu Lys Asp Leu Pro Ala
915 920 925 

Gly Thr Lys Tyr Val Ala Phe Arg His Tyr Asn Cys Thr Asp Val Leu 
930 935 940 

Gly Ile Met Ile Asp Asp Val Val Ile Thr Gly Glu Gly Glu Gly Pro
945 950 955 960 

Ser Tyr Thr Tyr Thr Val Tyr Arg Asp Gly Thr Lys Ile Gln Glu Gly
965 970 975 

Leu Thr Glu Thr Thr Tyr Arg Asp Ala Gly Met Ser Ala Gln Ser His
980 985 990 

Glu Tyr Cys Val Glu Val Lys Tyr Ala Ala Gly Val Ser Pro Lys Val 
995 1000 1005

Cys Val Asp Tyr Ile Pro Asp Gly Val Ala Asp Val Thr Ala Gln Lys
1010 1015 1020 

Pro Tyr Thr Leu Thr Val Val Gly Lys Thr Ile Thr Val Thr Cys Gln
1025 1030 1035 1040 

Gly Glu Ala Met Ile Tyr Asp Met Asn Gly Arg Arg Leu Ala Ala Gly
1045 1050 1055 

Arg Asn Thr Val Val Tyr Thr Ala Gln Gly Gly Tyr Tyr Ala Val Met
1060 1065 1070 

Val Val Val Asp Gly Lys Ser Tyr Val Glu Lys Leu Ala Ile Lys
1075 1080 1085 



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6895 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 696..5894

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

GGATCCTACG CCCGATACCC ATACTCGAAG CCTTTGCTCA GTACCATCCT GCAGAAGGTT 60

ACTCTTTCGC ATATAGTGAC CCTCTTTTCT CTCAGCATAA TGGTACCTAT CATATCAGTA 120

AGGGGCGTAT TGTCTTTTCG AACAATGTAC AGCCCGAGAA CTCTTTACTT CCACATCACA 180

CCCCCGACTC CTTAGTCAAG GATCTTTTTT CCCCTTTCCC CTCCGCTCTC TTCCTCATGC 240 

TGGACTGACT TAACCTTGGT CTGCTCTACT TTTCGGTTGT AAATACATGC AACACAATAA 300 

CTTTAAGTGT TGTTAGACAA CACTTTTACA AGACTCTGAC TTTTAATGAG GTGGAGCATG 360 

AACCTTTTCC TCTTTCATCT TCTCCTTCAG ATTACAGTCA ATATTTTGGC AAAAGGCTAA 420 

TTGACAGCCT TTTATAAGGG TTAATCCCTT GTGGCTTATA TTGAAAACAT GTTCTTTATA 480

ATCCGATACT CTTCTTAAAT CGAATTTTTT CTCTAAATTG CGCCGCAACA AAACTCCTTG 540 

AGAAAAGTAC CAATAGAAAT AGAAGGTAGC ATTTTGCCTT TAAATTCCTT TTCTTTTCTT 600 

GGATTGTTCT TGAAATGAAT CTTATTTGTG GATTTTTTTT GTTTTTTTAA CCCGGCCGTG 660 

GTTCTCTGAA TCACGACCAT AAATTGTTTT AAAGT ATG AGG AAA TTA TTA TTG 713

Met Arg Lys Leu Leu Leu 
1 5 

CTG ATC GCG GCG TCC CTT TTG GGA GTT GGT CTT TAC GCC CAA AGC GCC 761
Leu Ile Ala Ala Ser Leu Leu Gly Val Gly Leu Tyr Ala Gln Ser Ala
10 15 20

AAG ATT AAG CTT GAT GCT CCG ACT ACT CGA ACG ACA TGT ACG AAC AAT 809
Lys Ile Lys Leu Asp Ala Pro Thr Thr Arg Thr Thr Cys Thr Asn Asn
25 30 35 

AGC TTC AAG CAG TTC GAT GCA AGC TTT TCG TTC AAT GAA GTC GAG CTG 857 
Ser Phe Lys Gln Phe Asp Ala Ser Phe Ser Phe Asn Glu Val Glu Leu
40 45 50 

ACA AAG GTG GAG ACC AAA GGT GGT ACT TTC GCC TCA GTG TCA ATT CCG 905 
Thr Lys Val Glu Thr Lys Gly Gly Thr Phe Ala Ser Val Ser Ile Pro
55 60 65 70 

GGT GCA TTC CCG ACC GGT GAG GTT GGT TCT CCC GAA GTG CCA GCA GTT 953 
Gly Ala Phe Pro Thr Gly Glu Val Gly Ser Pro Glu Val Pro Ala Val 
75 80 85 

AGG AAG TTG ATT GCT GTG CCT GTC GGA GCC ACA CCT GTT GTT CGC GTG 1001 
Arg Lys Leu Ile Ala Val Pro Val Gly Ala Thr Pro Val Val Arg Val
90 95 100 

AAA AGT TTT ACC GAG CAA GTT TAC TCT CTG AAC CAA TAC GGT TCC GAA 1049 
Lys Ser Phe Thr Glu Gln Val Tyr Ser Leu Asn Gln Tyr Gly Ser Glu
105 110 115 

AAA CTC ATG CCA CAT CAA CCC TCT ATG AGC AAG AGT GAT GAT CCC GAA 1097 
Lys Leu Met Pro His Gln Pro Ser Met Ser Lys Ser Asp Asp Pro Glu
120 125 130 

AAG GTT CCC TTC GTT TAC AAT GCT GCT GCT TAT GCA CGC AAA GGT TTT 1145 
Lys Val Pro Phe Val Tyr Asn Ala Ala Ala Tyr Ala Arg Lys Gly Phe 
135 140 145 150 

GTC GGA CAA GAA CTG ACC CAA GTA GAA ATG TTG GGG ACA ATG CGT GGT 1193 
Val Gly Gln Glu Leu Thr Gln Val Glu Met Leu Gly Thr Met Arg Gly
155 160 165 

GTT CGC ATT GCA GCT CTT ACC ATT AAT CCT GTT CAG TAT GAT GTG GTT 1241 
Val Arg Ile Ala Ala Leu Thr Ile Asn Pro Val Gln Tyr Asp Val Val
170 175 180 

GCA AAC CAA TTG AAG GTT AGA AAC AAC ATC GAA ATT GAA GTA AGC TTT 1289 
Ala Asn Gln Leu Lys Val Arg Asn Asn Ile Glu Ile Glu Val Ser Phe
185 190 195 

CAA GGA GCT GAT GAA GTA GCT ACA CAA CGT TTG TAT GAT GCT TCT TTT 1337 
Gln Gly Ala Asp Glu Val Ala Thr Gln Arg Leu Tyr Asp Ala Ser Phe
200 205 210 

AGC CCT TAT TTC GAA ACA GCT TAT AAA CAG CTC TTC AAT AGA GAT GTT 1385 
Ser Pro Tyr Phe Glu Thr Ala Tyr Lys Gln Leu Phe Asn Arg Asp Val
215 220 225 230 

TAT ACA GAT CAT GGC GAC TTG TAT AAT ACG CCG GTT CGT ATG CTT GTT 1433 
Tyr Thr Asp His Gly Asp Leu Tyr Asn Thr Pro Val Arg Met Leu Val 
235 240 245 

GTT GCA GGT GCA AAA TTC AAA GAA GCT CTC AAG CCT TGG CTC ACT TGG 1481
Val Ala Gly Ala Lys Phe Lys Glu Ala Leu Lys Pro Trp Leu Thr Trp
250 255 260 

AAG GCT CAA AAG GGC TTC TAT CTG GAT GTG CAT TAC ACA GAC GAA GCT 1529 
Lys Ala Gln Lys Gly Phe Tyr Leu Asp Val His Tyr Thr Asp Glu Ala
265 270 275

GAA GTA GGA ACG ACA AAC GCC TCT ATC AAG GCA TTT ATT CAC AAG AAA 1577
Glu Val Gly Thr Thr Asn Ala Ser Ile Lys Ala Phe Ile His Lys Lys
280 285 290 

TAC AAT GAT GGA TTG GCA GCT AGT GCT GCT CCG GTC TTC TTG GCT TTG 1625 
Tyr Asn Asp Gly Leu Ala Ala Ser Ala Ala Pro Val Phe Leu Ala Leu 
295 300 305 310 

GTT GGT GAC ACT GAC GTT ATT AGC GGA GAA AAA GGA AAG AAA ACA AAA 1673 
Val Gly Asp Thr Asp Val Ile Ser Gly Glu Lys Gly Lys Lys Thr Lys
315 320 325 

AAA GTT ACC GAC TTG TAT TAC AGT GCA GTC GAT GGC GAC TAT TTC CCT 1721 
Lys Val Thr Asp Leu Tyr Tyr Ser Ala Val Asp Gly Asp Tyr Phe Pro 
330 335 340 

GAA ATG TAT ACT TTC CGT ATG TCT GCT TCT TCC CCA GAA GAA CTG ACG 1769 
Glu Met Tyr Thr Phe Arg Met Ser Ala Ser Ser Pro Glu Glu Leu Thr 
345 350 355 

AAC ATC ATT GAT AAG GTA TTG ATG TAT GAA AAG GCT ACT ATG CCA GAT 1817 
Asn Ile Ile Asp Lys Val Leu Met Tyr Glu Lys Ala Thr Met Pro Asp
360 365 370 

AAG AGT TAT TTG GAG AAA GTT CTC TTG ATT GCA GGT GCA GAT TAT AGC 1865 
Lys Ser Tyr Leu Glu Lys Val Leu Leu Ile Ala Gly Ala Asp Tyr Ser
375 380 385 390 

TGG AAT TCC CAG GTA GGT CAG CCA ACC ATT AAA TAC GGT ATG CAG TAC 1913 
Trp Asn Ser Gln Val Gly Gln Pro Thr Ile Lys Tyr Gly Met Gln Tyr
395 400 405 

TAC TAC AAC CAA GAG CAT GGT TAT ACC GAC GTG TAC AAC TAT CTC AAA 1961 
Tyr Tyr Asn Gln Glu His Gly Tyr Thr Asp Val Tyr Asn Tyr Leu Lys
410 415 420 

GCC CCT TAT ACA GGT TGC TAC AGT CAT TTG AAT ACC GGA GTC AGC TTT 2009 
Ala Pro Tyr Thr Gly Cys Tyr Ser His Leu Asn Thr Gly Val Ser Phe 
425 430 435 

GCA AAC TAT ACA GCG CAT GGA TCT GAG ACC GCA TGG GCT GAT CCA CTT 2057 
Ala Asn Tyr Thr Ala His Gly Ser Glu Thr Ala Trp Ala Asp Pro Leu 
440 445 450 



CTG ACT ACT TCT CAA CTG AAA GCA CTC ACT AAT AAG GAC AAA TAC TTC 2105
Leu Thr Thr Ser Gln Leu Lys Ala Leu Thr Asn Lys Asp Lys Tyr Phe
455 460 465 470

TTA GCT ATT GGC AAC TGC TGT ATT ACA GCT CAA TTC GAT TAT GTA CAG 2153
Leu Ala Ile Gly Asn Cys Cys Ile Thr Ala Gln Phe Asp Tyr Val Gln
475 480 485 

CCT TGC TTC GGA GAG GTA ATA ACT CGC GTT AAG GAG AAA GGG GCT TAT 2201
Pro Cys Phe Gly Glu Val Ile Thr Arg Val Lys Glu Lys Gly Ala Tyr
490 495 500 

GCC TAT ATC GGT TCA TCT CCA AAT TCT TAT TGG GGC GAG GAC TAC TAT 2249
Ala Tyr Ile Gly Ser Ser Pro Asn Ser Tyr Trp Gly Glu Asp Tyr Tyr
505 510 515

TGG AGT GTG GGT GCT AAT GCC GTA TTT GGT GTT GAG CCT ACT TTT GAA 2297
Trp Ser Val Gly Ala Asn Ala Val Phe Gly Val Gin Pro Thr Phe Glu 
520 525 530 

GGT ACG TCT ATG GGT TCT TAT GAT GCT ACA TTC TTG GAG GAT TCG TAC 2345 
Gly Thr Ser Met Gly Ser Tyr Asp Ala Thr Phe Leu Glu Asp Ser Tyr 
535 540 545 550 

AAC ACA GTG AAT TCT ATT ATG TGG GCA GGT AAT CTT GCC GCT ACT CAT 2393 
Asn Thr Val Asn Ser Ile Met Trp Ala Gly Asn Leu Ala Ala Thr His
555 560 565 

GCT GGA AAT ATC GGC AAT ATT ACC CAT ATT GGT GCT CAT TAC TAT TGG 2441 
Ala Gly Asn Ile Gly Asn Ile Thr His Ile Gly Ala His Tyr Tyr Trp
570 575 580 

GAA GCT TAT CAT GTC CTT GGC GAT GGT TCG GTT ATG CCT TAT CGT GCA 2489 
Glu Ala Tyr His Val Leu Gly Asp Gly Ser Val Met Pro Tyr Arg Ala 
585 590 595 

ATG CCT AAG ACC AAT ACT TAT ACG CTT CCT GCC TCT TTG CCT CAG AAT 2537 
Met Pro Lys Thr Asn Thr Tyr Thr Leu Pro Ala Ser Leu Pro Gln Asn
600 605 610 

CAG GCT TCT TAT AGC ATT CAG GCT TCT GCC GGT TCT TAC GTA GCT ATT 2585 
Gln Ala Ser Tyr Ser Ile Gln Ala Ser Ala Gly Ser Tyr Val Ala Ile
615 620 625 630 

TCT AAA GAT GGA GTT TTG TAT GGA ACA GGT GTT GCT AAT GCC AGC GGT 2633 
Ser Lys Asp Gly Val Leu Tyr Gly Thr Gly Val Ala Asn Ala Ser Gly 
635 640 645 

GTT GCG ACT GTG AGT ATG ACT AAG CAG ATT ACG GAA AAT GGT AAT TAT 2681 
Val Ala Thr Val Ser Met Thr Lys Gln Ile Thr Glu Asn Gly Asn Tyr
650 655 660 

GAT GTA GTT ATC ACT CGC TCT AAT TAT CTT CCT GTG ATC AAG CAA ATT 2729 
Asp Val Val Ile Thr Arg Ser Asn Tyr Leu Pro Val Ile Lys Gln Ile
665 670 675 

CAG GTA GGT GAG CCT AGC CCC TAC CAG CCC GTT TCC AAC TTG ACA GCT 2777 
Gln Val Gly Glu Pro Ser Pro Tyr Gln Pro Val Ser Asn Leu Thr Ala
680 685 690 

ACA ACG CAG GGT CAG AAA GTA ACG CTC AAG TGG GAA GCA CCG AGC GCA 2825 
Thr Thr Gln Gly Gln Lys Val Thr Leu Lys Trp Glu Ala Pro Ser Ala
695 700 705 710 

AAG AAG GCA GAA GGT TCC CGT GAA GTA AAA CGG ATC GGA GAC GGT CTT 2873 
Lys Lys Ala Glu Gly Ser Arg Glu Val Lys Arg Ile Gly Asp Gly Leu
715 720 725 

TTC GTT ACG ATC GAA CCT GCA AAC GAT GTA CGT GCC AAC GAA GCC AAG 2921 
Phe Val Thr Ile Glu Pro Ala Asn Asp Val Arg Ala Asn Glu Ala Lys
730 735 740 

GTT GTG CTT GCG GCA GAC AAC GTA TGG GGA GAC AAT ACG GGT TAC CAG 2969 
Val Val Leu Ala Ala Asp Asn Val Trp Gly Asp Asn Thr Gly Tyr Gln
745 750 755 

TTC TTG TTG GAT GCC GAT CAC AAT ACA TTC GGA AGT GTC ATT CCG GCA 3017 
Phe Leu Leu Asp Ala Asp His Asn Thr Phe Gly Ser Val Ile Pro Ala
760 765 770

ACC GGT CCT CTC TTT ACC GGA ACA GCT TCT TCC AAT CTT TAC AGT GCG 3065
Thr Gly Pro Leu Phe Thr Gly Thr Ala Ser Ser Asn Leu Tyr Ser Ala 
775 780 785 790 

AAC TTC GAG TAT TTG GTC CCG GCC AAT GCC GAT CCT GTT GTT ACT ACA 3113 
Asn Phe Glu Tyr Leu Val Pro Ala Asn Ala Asp Pro Val Val Thr Thr 
795 800 805 

CAG AAT ATT ATC GTT ACA GGA CAG GGT GAA GTT GTA ATC CCC GGT GGT 3161 
Gln Asn Ile Ile Val Thr Gly Gln Gly Glu Val Val Ile Pro Gly Gly
810 815 820 

GTT TAC GAC TAT TGC ATT ACG AAC CCG GAA CCT GCA TCC GGA AAG ATG 3209 
Val Tyr Asp Tyr Cys Ile Thr Asn Pro Glu Pro Ala Ser Gly Lys Met
825 830 835 

TGG ATC GCA GGA GAT GGA GGC AAC CAG CCT GCA CGT TAT GAC GAT TTC 3257 
Trp Ile Ala Gly Asp Gly Gly Asn Gln Pro Ala Arg Tyr Asp Asp Phe
840 845 850 

ACA TTC GAA GCA GGC AAG AAG TAC ACC TTC ACG ATG CGT CGC GCC GGA 3305 
Thr Phe Glu Ala Gly Lys Lys Tyr Thr Phe Thr Met Arg Arg Ala Gly 
855 860 865 870 

ATG GGA GAT GGA ACT GAT ATG GAA GTC GAA GAC GAT TCA CCT GCA AGC 3353 
Met Gly Asp Gly Thr Asp Met Glu Val Glu Asp Asp Ser Pro Ala Ser 
875 880 885 

TAT ACC TAC ACG GTG TAT CGT GAC GGC ACG AAG ATC AAG GAA GGT CTG 3401 
Tyr Thr Tyr Thr Val Tyr Arg Asp Gly Thr Lys Ile Lys Glu Gly Leu
890 895 900 

ACA GCT ACG ACA TTC GAA GAA GAC GGT GTA GCT GCA GGC AAT CAT GAG 3449 
Thr Ala Thr Thr Phe Glu Glu Asp Gly Val Ala Ala Gly Asn His Glu
905 910 915 

TAT TGC GTG GAA GTT AAG TAC ACA GCC GGC GTA TCT CCG AAG GTA TGT 3497 
Tyr Cys Val Glu Val Lys Tyr Thr Ala Gly Val Ser Pro Lys Val Cys 
920 925 930 

AAA GAC GTT ACG GTA GAA GGA TCC AAT GAA TTT GCT CCT GTA CAG AAC 3545 
Lys Asp Val Thr Val Glu Gly Ser Asn Glu Phe Ala Pro Val Gln Asn
935 940 945 950 

CTG ACC GGT AGT TCA GTA GGT CAG AAA GTA ACG CTT AAG TGG GAT GCA 3593 
Leu Thr Gly Ser Ser Val Gly Gln Lys Val Thr Leu Lys Trp Asp Ala
955 960 965 

CCT AAT GGT ACC CCG AAT CCG AAT CCA AAT CCG AAT CCG AAT CCG GGA 3641 
Pro Asn Gly Thr Pro Asn Pro Asn Pro Asn Pro Asn Pro Asn Pro Gly 
970 975 980 

ACA ACA CTT TCC GAA TCA TTC GAA AAT GGT ATT CCG GCA TCT TGG AAG 3689
Thr Thr Leu Ser Glu Ser Phe Glu Asn Gly Ile Pro Ala Ser Trp Lys
985 990 995 

ACG ATC GAT GCA GAC GGT GAC GGG CAT GGC TGG AAA CCT GGA AAT GCT 3737 
Thr Ile Asp Ala Asp Gly Asp Gly His Gly Trp Lys Pro Gly Asn Ala
1000 1005 1010 

CCC GGA ATC GCT GGC TAC AAT AGC AAT GGT TGT GTA TAT TCA GAG TCA 3785
Pro Gly Ile Ala Gly Tyr Asn Ser Asn Gly Cys Val Tyr Ser Glu Ser
1015 1020 1025 1030

TTC GGT CTT GGT GGT ATA GGA GTT CTT ACC CCT GAC AAC TAT CTG ATA 3833
Phe Gly Leu Gly Gly Ile Gly Val Leu Thr Pro Asp Asn Tyr Leu Ile
1035 1040 1045 

ACA CCG GCA TTG GAT TTG CCT AAC GGA GGT AAG TTG ACT TTC TGG GTA 3881 
Thr Pro Ala Leu Asp Leu Pro Asn Gly Gly Lys Leu Thr Phe Trp Val 
1050 1055 1060 

TGC GCA CAG GAT GCT AAT TAT GCA TCC GAG CAC TAT GCG GTG TAT GCA 3929 
Cys Ala Gln Asp Ala Asn Tyr Ala Ser Glu His Tyr Ala Val Tyr Ala
1065 1070 1075 

TCT TCG ACC GGT AAC GAT GCA TCC AAC TTC ACG AAT GCT TTG TTG GAA 3977 
Ser Ser Thr Gly Asn Asp Ala Ser Asn Phe Thr Asn Ala Leu Leu Glu 
1080 1085 1090 

GAG ACG ATT ACG GCA AAA GGT GTT CGC TCG CCG AAA GCT ATT CGT GGT 4025 
Glu Thr Ile Thr Ala Lys Gly Val Arg Ser Pro Lys Ala Ile Arg Gly
1095 1100 1105 1110

CGT ATA CAG GGT ACT TGG CGC CAG AAG ACG GTA GAC CTT CCC GCA GGT 4073
Arg Ile Gln Gly Thr Trp Arg Gln Lys Thr Val Asp Leu Pro Ala Gly
1115 1120 1125 

ACG AAA TAT GTT GCT TTC CGT CAC TTC CAA AGC ACG GAT ATG TTC TAC 4121 
Thr Lys Tyr Val Ala Phe Arg His Phe Gln Ser Thr Asp Met Phe Tyr
1130 1135 1140 

ATC GAC CTT GAT GAG GTT GAG ATC AAG GCC AAT GGC AAG CGC GCA GAC 4169 
Ile Asp Leu Asp Glu Val Glu Ile Lys Ala Asn Gly Lys Arg Ala Asp
1145 1150 1155 

TTC ACG GAA ACG TTC GAG TCT TCT ACT CAT GGA GAG GCA CCA GCG GAA 4217 
Phe Thr Glu Thr Phe Glu Ser Ser Thr His Gly Glu Ala Pro Ala Glu 
1160 1165 1170 

TGG ACT ACT ATC GAT GCC GAT GGC GAT GGT CAG GGT TGG CTC TGT CTG 4265 
Trp Thr Thr Ile Asp Ala Asp Gly Asp Gly Gln Gly Trp Leu Cys Leu
1175 1180 1185 1190

TCT TCC GGA CAA TTG GAC TGG CTG ACA GCT CAT GGC GGC AGC AAC GTA 4313 
Ser Ser Gly Gln Leu Asp Trp Leu Thr Ala His Gly Gly Ser Asn Val
1195 1200 1205 

GTA AGC TCT TTC TCA TGG AAT GGA ATG GCT TTG AAT CCT GAT AAC TAT 4361 
Val Ser Ser Phe Ser Trp Asn Gly Met Ala Leu Asn Pro Asp Asn Tyr 
1210 1215 1220 

CTC ATC TCA AAG GAT GTT ACA GGC GCA ACG AAG GTA AAG TAC TAC TAT 4409 
Leu Ile Ser Lys Asp Val Thr Gly Ala Thr Lys Val Lys Tyr Tyr Tyr
1225 1230 1235 

GCA GTC AAC GAC GGT TTT CCC GGG GAT CAC TAT GCG GTG ATG ATC TCC 4457 
Ala Val Asn Asp Gly Phe Pro Gly Asp His Tyr Ala Val Met Ile Ser
1240 1245 1250 

AAG ACG GGC ACG AAC GCC GGA GAC TTC ACG GTT GTT TTC GAA GAA ACG 4505 
Lys Thr Gly Thr Asn Ala Gly Asp Phe Thr Val Val Phe Glu Glu Thr 
1255 1260 1265 1270 

CCT AAC GGA ATA AAT AAG GGC GGA GCA AGA TTC GGT CTT TCC ACG GAA 4553 
Pro Asn Gly Ile Asn Lys Gly Gly Ala Arg Phe Gly Leu Ser Thr Glu
1275 1280 1285

GCC AAT GGC GCC AAA CCT CAA AGT GTA TGG ATC GAG CGT ACG GTA GAT 4601
Ala Asn Gly Ala Lys Pro Gln Ser Val Trp Ile Glu Arg Thr Val Asp
1290 1295 1300

TTG CCT GCA GGC ACG AAG TAT GTT GCT TTC CGT CAC TAC AAT TGC TCG 4649
Leu Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg His Tyr Asn Cys Ser
1305 1310 1315

GAT TTG AAC TAC ATT CTT TTG GAT GAT ATT CAG TTC ACC ATG GGT GGC 4697
Asp Leu Asn Tyr Ile Leu Leu Asp Asp Ile Gln Phe Thr Met Gly Gly
1320 1325 1330

AGC CCC ACC CCG ACC GAT TAT ACC TAC ACG GTG TAT CGT GAT GGT ACG 4745
Ser Pro Thr Pro Thr Asp Tyr Thr Tyr Thr Val Tyr Arg Asp Gly Thr
1335 1340 1345 1350

AAG ATC AAG GAA GGT TTG ACC GAA ACG ACC TTC GAA GAA GAC GGC GTA 4793
Lys Ile Lys Glu Gly Leu Thr Glu Thr Thr Phe Glu Glu Asp Gly Val
1355 1360 1365

GCT ACG GGC AAT CAT GAG TAT TGC GTG GAA GTG AAG TAC ACA GCC GGC 4841
Ala Thr Gly Asn His Glu Tyr Cys Val Glu Val Lys Tyr Thr Ala Gly
1370 1375 1380

GTA TCT CCG AAG AAA TGT GTA GAC GTA ACT GTT AAT TCG ACA CAG TTC 4889
Val Ser Pro Lys Lys Cys Val Asp Val Thr Val Asn Ser Thr Gln Phe
1385 1390 1395

AAT CCT GTA CAG AAC CTG ACG GCA GAA CAA GCT CCT AAC AGC ATG GAT 4937
Asn Pro Val Gln Asn Leu Thr Ala Glu Gln Ala Pro Asn Ser Met Asp
1400 1405 1410

GCA ATC CTT AAA TGG AAT GCA CCG GCA TCT AAG CGT GCG GAA GTT CTG 4985
Ala Ile Leu Lys Trp Asn Ala Pro Ala Ser Lys Arg Ala Glu Val Leu
1415 1420 1425 1430

AAC GAA GAC TTC GAA AAT GGT ATT CCT GCC TCA TGG AAG ACG ATC GAT 5033
Asn Glu Asp Phe Glu Asn Gly Ile Pro Ala Ser Trp Lys Thr Ile Asp
1435 1440 1445

GCA GAC GGT GAC GGC AAC AAT TGG ACG ACG ACC CCT CCT CCC GGA GGC 5081
Ala Asp Gly Asp Gly Asn Asn Trp Thr Thr Thr Pro Pro Pro Gly Gly
1450 1455 1460

TCC TCT TTT GCA GGT CAC AAC AGT GCG ATC TGT GTC TCT TCA GCT TCT 5129
Ser Ser Phe Ala Gly His Asn Ser Ala Ile Cys Val Ser Ser Ala Ser
1465 1470 1475

CAT ATC AAC TTT GAA GGT CCT CAG AAC CCT GAT AAC TAT CTG GTT ACA 5177 
His Ile Asn Phe Glu Gly Pro Gln Asn Pro Asp Asn Tyr Leu Val Thr
1480 1485 1490 

CCG GAG CTT TCT CTT CCT GGC GGA GGA ACG CTT ACT TTC TGG GTA TGT 5225 
Pro Glu Leu Ser Leu Pro Gly Gly Gly Thr Leu Thr Phe Trp Val Cys 
1495 1500 1505 1510 

GCA CAA GAT GCC AAT TAT GCA TCA GAG CAC TAT GCC GTG TAC GCA TCT 5273 
Ala Gln Asp Ala Asn Tyr Ala Ser Glu His Tyr Ala Val Tyr Ala Ser
1515 1520 1525 

TCT ACG GGT AAC GAC GCT TCC AAC TTC GCC AAC GCT TTG TTG GAA GAA 5321 
Ser Thr Gly Asn Asp Ala Ser Asn Phe Ala Asn Ala Leu Leu Glu Glu 
1530 1535 1540 
GTG CTG ACG GCC AAG ACA GTT GTT ACG GCA CCT GAA GCC ATT CGT GGT 5369
Val Leu Thr Ala Lys Thr Val Val Thr Ala Pro Glu Ala Ile Arg Gly
1545 1550 1555 

ACT CGT GCT CAG GGC ACC TGG TAT CAA AAG ACG GTA CAG TTG CCT GCG 5417 
Thr Arg Ala Gln Gly Thr Trp Tyr Gln Lys Thr Val Gln Leu Pro Ala
1560 1565 1570 

GGT ACT AAG TAT GTT GCC TTC CGT CAC TTC GGC TGT ACG GAC TTC TTC 5465 
Gly Thr Lys Tyr Val Ala Phe Arg His Phe Gly Cys Thr Asp Phe Phe 
1575 1580 1585 1590 

TGG ATC AAC CTT GAT GAT GTT GTA ATC ACT TCA GGG AAC GCT CCG TCT 5513 
Trp Ile Asn Leu Asp Asp Val Val Ile Thr Ser Gly Asn Ala Pro Ser
1595 1600 1605 

TAC ACC TAT ACG ATC TAT CGT AAT AAT ACA CAG ATA GCA TCA GGC GTA 5561 
Tyr Thr Tyr Thr Ile Tyr Arg Asn Asn Thr Gln Ile Ala Ser Gly Val
1610 1615 1620 

ACG GAG ACT ACT TAC CGA GAT CCG GAC TTG GCT ACC GGT TTT TAC ACG 5609 
Thr Glu Thr Thr Tyr Arg Asp Pro Asp Leu Ala Thr Gly Phe Tyr Thr 
1625 1630 1635 

TAC GGT GTA AAG GTT GTT TAC CCG AAC GGA GAA TCA GCT ATC GAA ACT 5657 
Tyr Gly Val Lys Val Val Tyr Pro Asn Gly Glu Ser Ala Ile Glu Thr
1640 1645 1650 

GCT ACG TTG AAT ATC ACT TCG TTG GCA GAC GTA ACG GCT CAG AAG CCT 5705 
Ala Thr Leu Asn Ile Thr Ser Leu Ala Asp Val Thr Ala Gln Lys Pro
1655 1660 1665 1670 

TAC ACG CTG ACA GTT GTA GGA AAG ACG ATC ACG GTA ACT TGC CAA GGC 5753 
Tyr Thr Leu Thr Val Val Gly Lys Thr Ile Thr Val Thr Cys Gln Gly
1675 1680 1685 

GAA GCT ATG ATC TAC GAC ATG AAC GGT CGT CGT CTG GCA GCG GGT CGC 5801 
Glu Ala Met Ile Tyr Asp Met Asn Gly Arg Arg Leu Ala Ala Gly Arg
1690 1695 1700 

AAC ACG GTT GTT TAC ACG GCT CAG GGC GGC CAC TAT GCA GTC ATG GTT 5849
Asn Thr Val Val Tyr Thr Ala Gln Gly Gly His Tyr Ala Val Met Val
1705 1710 1715 



GTC GTT GAC GGC AAG TCT TAC GTA GAG AAA CTC GCT GTA AAG TAAATCTGTC 5901
Val Val Asp Gly Lys Ser Tyr Val Glu Lys Leu Ala Val Lys
1720 1725 1730

TTGGACTCGG AGACTTTGTG CAGACACTTT TAAGATAGGT CTGTAATTGT CTCAGAGTAT 5961

GAATCGGTCG CCCGACTTCC TTAAAAGGAG GTCGGGCGAC TTCGTTTTTA TTATTGCTGT 6021

CCGGTAAACT TGTCAAGAGG AGACCTTTGA AAAATGAGAC CTTTGCACGG CGATTGGTGT 6081

GTATTTTGTT TGTTAATTCA TTGTATAATA GGGAGTTATT TTGTATATTT GAGTATTAAA 6141

AACAGCATAA TATTCCTCCC ATGGCATACC AATCCAAGAA TACCGATGAG CATGTAACAT 6201

TTGCAGACGC ACTCCTTTCA AAGCGTTATC GCAAAGCACA AAACGACTTC CTCAATCAGG 6261

TTGACAGGCT TATCGATTGG CGTCCGATCA GGACGCTGAT CAACAAGAAA TACACGAAGC 6321

GACAAAATGC CATCGGCGCC CCGGCTTATG ACGTGATTCT CTTATTCAAG ATGTTGCTTC 6381

CGAAGACATG GTACAACCTC AGTGATTGTG CTTTGGAGGA GCGCATCAAT GATTCAATCA 6441

CCTTTTCCCG ATTCTTGGGG CTATGGAAGA GGTATCTCCC GACCACAGCA CCATCAGTCG 6501

ATTTCGTTCG GCACTGACAG AGTTGGGGCT CATGGACAAA CTATTGGCGC AGTTTAACAA 6561

ACAACTTTTC CGCCATCACA TTTCGGTCAG GGAAAGGGTG CTTGTCGATG CAAGCCTTGT 6621

GGAGATACGG AGCACCATCG AACGCACCTT TGGCAGTATT CGCCGGTGGT TTCATGGCGG 6681

ACGATGTCGA TACCGGGGAC TTGCCAAGAC CCATACTCAA AACATTCTTG AAAGCATCGC 6741

CTTTAATTTA TACAGAACCC CGGGGATAAT TATGTCCTCA TCTCTAGGAT AAGGTATAAC 6801 

CACCCTTGAG GAGCTCGTGC AAGCAGCTCC TCAAGGGGGA TTTACAACTA CTTTCACTCC 6861 

TTACCGCCAC CCTTTTCCCT CCCTCCCGGA ATTC 6895 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1732 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Met Arg Lys Leu Leu Leu Leu Ile Ala Ala Ser Leu Leu Gly Val Gly
1 5 10 15

Leu Tyr Ala Gln Ser Ala Lys Ile Lys Leu Asp Ala Pro Thr Thr Arg
20 25 30 

Thr Thr Cys Thr Asn Asn Ser Phe Lys Gin Phe Asp Ala Ser Phe Ser 
35 40 45 

Phe Asn Glu Val Glu Leu Thr Lys Val Glu Thr Lys Gly Gly Thr Phe 
50 55 60 

Ala Ser Val Ser He Pro Gly Ala Phe Pro Thr Gly Glu Val Gly Ser 
65 70 75 80 

Pro Glu Val Pro Ala Val Arg Lys Leu He Ala Val Pro Val Gly Ala 
85 90 95 

Thr Pro Val Val Arg Val Lys Ser Phe Thr Glu Gln Val Tyr Ser Leu
100 105 110

Asn Gin Tyr Gly Ser Glu Lys Leu Met Pro His Gin Pro Ser Met Ser 
115 120 125 

Lys Ser Asp Asp Pro Glu Lys Val Pro Phe Val Tyr Asn Ala Ala Ala 
130 135 140 

Tyr Ala Arg Lys Gly Phe Val Gly Gin Glu Leu Thr Gin Val Glu Met 
145 150 155 160 

Leu Gly Thr Met Arg Gly Val Arg He Ala Ala Leu Thr He Asn Pro 
165 170 175 
Val Gln Tyr Asp Val Val Ala Asn Gln Leu Lys Val Arg Asn Asn Ile
180 185 190 

Glu lie Glu Val Ser Phe Gin Gly Ala Asp Glu Val Ala Thr Gin Arg 
195 200 205 

Leu Tyr Asp Ala Ser Phe Ser Pro Tyr Phe Glu Thr Ala Tyr Lys Gin 
210 215 220 

Leu Phe Asn Arg Asp Val Tyr Thr Asp His Gly Asp Leu Tyr Asn Thr 
225 230 235 240 

Pro Val Arg Met Leu Val Val Ala Gly Ala Lys Phe Lys Glu Ala Leu 
245 250 255 

Lys Pro Trp Leu Thr Trp Lys Ala Gin Lys Gly Phe Tyr Leu Asp Val 
260 265 270 

His Tyr Thr Asp Glu Ala Glu Val Gly Thr Thr Asn Ala Ser lie Lys 
275 280 285 

Ala Phe lie His Lys Lys Tyr Asn Asp Gly Leu Ala Ala Ser Ala Ala 
290 295 300 

Pro Val Phe Leu Ala Leu Val Gly Asp Thr Asp Val lie Ser Gly Glu 
305 310 315 320 

Lys Gly Lys Lys Thr Lys Lys Val Thr Asp Leu Tyr Tyr Ser Ala Val 
325 330 335 

Asp Gly Asp Tyr Phe Pro Glu Met Tyr Thr Phe Arg Met Ser Ala Ser 
340 345 350 

Ser Pro Glu Glu Leu Thr Asn lie lie Asp Lys Val Leu Met Tyr Glu 
355 360 365 

Lys Ala Thr Met Pro Asp Lys Ser Tyr Leu Glu Lys Val Leu Leu lie 
370 375 380 

Ala Gly Ala Asp Tyr Ser Trp Asn Ser Gin Val Gly Gin Pro Thr lie 
385 390 395 400 

Lys Tyr Gly Met Gin Tyr Tyr Tyr Asn Gin Glu His Gly Tyr Thr Asp 
405 410 415

Val Tyr Asn Tyr Leu Lys Ala Pro Tyr Thr Gly Cys Tyr Ser His Leu 
420 425 430 

Asn Thr Gly Val Ser Phe Ala Asn Tyr Thr Ala His Gly Ser Glu Thr 
435 440 445 

Ala Trp Ala Asp Pro Leu Leu Thr Thr Ser Gin Leu Lys Ala Leu Thr 
450 455 460 

Asn Lys Asp Lys Tyr Phe Leu Ala lie Gly Asn Cys Cys lie Thr Ala 
465 470 475 480 

Gin Phe Asp Tyr Val Gin Pro Cys Phe Gly Glu Val lie Thr Arg Val 
485 490 495 

Lys Glu Lys Gly Ala Tyr Ala Tyr lie Gly Ser Ser Pro Asn Ser Tyr 
500 505 510 
Trp Gly Glu Asp Tyr Tyr Trp Ser Val Gly Ala Asn Ala Val Phe Gly
515 520 525

Val Gin Pro Thr Phe Glu Gly Thr Ser Met Gly Ser Tyr Asp Ala Thr 
530 535 540 

Phe Leu Glu Asp Ser Tyr Asn Thr Val Asn Ser lie Met Trp Ala Gly 
545 550 555 560 

Asn Leu Ala Ala Thr His Ala Gly Asn lie Gly Asn He Thr His He 
565 570 575 

Gly Ala His Tyr Tyr Trp Glu Ala Tyr His Val Leu Gly Asp Gly Ser 
580 585 590 

Val Met Pro Tyr Arg Ala Met Pro Lys Thr Asn Thr Tyr Thr Leu Pro 
595 600 605 

Ala Ser Leu Pro Gin Asn Gin Ala Ser Tyr Ser He Gin Ala Ser Ala 
610 615 620 

Gly Ser Tyr Val Ala He Ser Lys Asp Gly Val Leu Tyr Gly Thr Gly 
625 630 635 640 

Val Ala Asn Ala Ser Gly Val Ala Thr Val Ser Met Thr Lys Gin He 
645 650 655 

Thr Glu Asn Gly Asn Tyr Asp Val Val He Thr Arg Ser Asn Tyr Leu 
660 665 670 

Pro Val He Lys Gin He Gin Val Gly Glu Pro Ser Pro Tyr Gin Pro 
675 680 685 

Val Ser Asn Leu Thr Ala Thr Thr Gin Gly Gin Lys Val Thr Leu Lys 
690 695 700 

Trp Glu Ala Pro Ser Ala Lys Lys Ala Glu Gly Ser Arg Glu Val Lys 
705 710 715 720 

Arg He Gly Asp Gly Leu Phe Val Thr He Glu Pro Ala Asn Asp Val 
725 730 735 

Arg Ala Asn Glu Ala Lys Val Val Leu Ala Ala Asp Asn Val Trp Gly 
740 745 750 

Asp Asn Thr Gly Tyr Gin Phe Leu Leu Asp Ala Asp His Asn Thr Phe 
755 760 765 

Gly Ser Val He Pro Ala Thr Gly Pro Leu Phe Thr Gly Thr Ala Ser 
770 775 780 

Ser Asn Leu Tyr Ser Ala Asn Phe Glu Tyr Leu Val Pro Ala Asn Ala 
785 790 795 800 

Asp Pro Val Val Thr Thr Gin Asn He He Val Thr Gly Gin Gly Glu 
805 810 815 

Val Val He Pro Gly Gly Val Tyr Asp Tyr Cys He Thr Asn Pro Glu 
820 825 830 

Pro Ala Ser Gly Lys Met Trp He Ala Gly Asp Gly Gly Asn Gin Pro 
835 840 845 
Ala Arg Tyr Asp Asp Phe Thr Phe Glu Ala Gly Lys Lys Tyr Thr Phe
850 855 860

Thr Met Arg Arg Ala Gly Met Gly Asp Gly Thr Asp Met Glu Val Glu
865 870 875 880

Asp Asp Ser Pro Ala Ser Tyr Thr Tyr Thr Val Tyr Arg Asp Gly Thr
885 890 895

Lys lie Lys Glu Gly Leu Thr Ala Thr Thr Phe Glu Glu Asp Gly Val 
900 905 910 

Ala Ala Gly Asn His Glu Tyr Cys Val Glu Val Lys Tyr Thr Ala Gly 
915 920 925 

Val Ser Pro Lys Val Cys Lys Asp Val Thr Val Glu Gly Ser Asn Glu 
930 935 940 

Phe Ala Pro Val Gin Asn Leu Thr Gly Ser Ser Val Gly Gin Lys Val 
945 950 955 960 

Thr Leu Lys Trp Asp Ala Pro Asn Gly Thr Pro Asn Pro Asn Pro Asn 
965 970 975 

Pro Asn Pro Asn Pro Gly Thr Thr Leu Ser Glu Ser Phe Glu Asn Gly 
980 985 990 

lie Pro Ala Ser Trp Lys Thr lie Asp Ala Asp Gly Asp Gly His Gly 
995 1000 1005 

Trp Lys Pro Gly Asn Ala Pro Gly lie Ala Gly Tyr Asn Ser Asn Gly 
1010 1015 1020 

Cys Val Tyr Ser Glu Ser Phe Gly Leu Gly Gly lie Gly Val Leu Thr 
1025 1030 1035 1040 

Pro Asp Asn Tyr Leu He Thr Pro Ala Leu Asp Leu Pro Asn Gly Gly 
1045 1050 1055 

Lys Leu Thr Phe Trp Val Cys Ala Gin Asp Ala Asn Tyr Ala Ser Glu 
1060 1065 1070 

His Tyr Ala Val Tyr Ala Ser Ser Thr Gly Asn Asp Ala Ser Asn Phe 
1075 1080 1085 

Thr Asn Ala Leu Leu Glu Glu Thr He Thr Ala Lys Gly Val Arg Ser 
1090 1095 1100 

Pro Lys Ala He Arg Gly Arg He Gin Gly Thr Trp Arg Gin Lys Thr 
1105 1110 1115 1120 

Val Asp Leu Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg His Phe Gin 
1125 1130 1135 

Ser Thr Asp Met Phe Tyr He Asp Leu Asp Glu Val Glu He Lys Ala 
1140 1145 1150 

Asn Gly Lys Arg Ala Asp Phe Thr Glu Thr Phe Glu Ser Ser Thr His 
1155 1160 1165 

Gly Glu Ala Pro Ala Glu Trp Thr Thr He Asp Ala Asp Gly Asp Gly 
1170 1175 1180
Gln Gly Trp Leu Cys Leu Ser Ser Gly Gln [illegible] Trp [illegible]
1185 1190 1195 1200

His Gly Gly Ser Asn Val Val Ser Ser Phe Ser Trp Asn Gly Met Ala 
1205 1210 1215 

Leu Asn Pro Asp Asn Tyr Leu lie Ser Lys Asp Val Thr Gly Ala Thr 
1220 1225 1230 

Lys Val Lys Tyr Tyr Tyr Ala Val Asn Asp Gly Phe Pro Gly Asp His
1235 1240 1245 

Tyr Ala Val Met He Ser Lys Thr Gly Thr Asn Ala Gly Asp Phe Thr 
1250 1255 1260 

Val Val Phe Glu Glu Thr Pro Asn Gly Ile Asn Lys Gly Gly Ala Arg
1265 1270 1275 1280 

Phe Gly Leu Ser Thr Glu Ala Asn Gly Ala Lys Pro Gin Ser Val Trp 
1285 1290 1295 

He Glu Arg Thr Val Asp Leu Pro Ala Gly Thr Lys Tyr Val Ala Phe 
1300 1305 1310 

Arg His Tyr Asn Cys Ser Asp Leu Asn Tyr He Leu Leu Asp Asp He 
1315 1320 1325 

Gin Phe Thr Met Gly Gly Ser Pro Thr Pro Thr Asp Tyr Thr Tyr Thr 
1330 1335 1340 

Val Tyr Arg Asp Gly Thr Lys He Lys Glu Gly Leu Thr Glu Thr Thr 
1345 1350 1355 1360

Phe Glu Glu Asp Gly Val Ala Thr Gly Asn His Glu Tyr Cys Val Glu 
1365 1370 1375 

Val Lys Tyr Thr Ala Gly Val Ser Pro Lys Lys Cys Val Asp Val Thr
1380 1385 1390 

Val Asn Ser Thr Gin Phe Asn Pro Val Gin Asn Leu Thr Ala Glu Gin 
1395 1400 1405

Ala Pro Asn Ser Met Asp Ala He Leu Lys Trp Asn Ala Pro Ala Ser 
1410 1415 1420 

Lys Arg Ala Glu Val Leu Asn Glu Asp Phe Glu Asn Gly Ile Pro Ala
1425 1430 1435 1440 

Ser Trp Lys Thr He Asp Ala Asp Gly Asp Gly Asn Asn Trp Thr Thr 
1445 1450 1455 

Thr Pro Pro Pro Gly Gly Ser Ser Phe Ala Gly His Asn Ser Ala He 
1460 1465 1470 

Cys Val Ser Ser Ala Ser His He Asn Phe Glu Gly Pro Gin Asn Pro 
1475 1480 1485 

Asp Asn Tyr Leu Val Thr Pro Glu Leu Ser Leu Pro Gly Gly Gly Thr 
1490 1495 1500 

Leu Thr Phe Trp Val Cys Ala Gin Asp Ala Asn Tyr Ala Ser Glu His 
1505 1510 1515 1520 
Tyr Ala Val Tyr Ala Ser Ser Thr Gly Asn Asp Ala Ser Asn Phe Ala
1525 1530 1535 

Asn Ala Leu Leu Glu Glu Val Leu Thr Ala Lys Thr Val Val Thr Ala 
1540 1545 1550 

Pro Glu Ala Ile Arg Gly Thr Arg Ala Gln Gly Thr Trp Tyr Gln Lys
1555 1560 1565

Thr Val Gin Leu Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg His Phe 
1570 1575 1580 

Gly Cys Thr Asp Phe Phe Trp lie Asn Leu Asp Asp Val Val lie Thr 
1585 1590 1595 1600 

Ser Gly Asn Ala Pro Ser Tyr Thr Tyr Thr lie Tyr Arg Asn Asn Thr 
1605 1610 1615 

Gin lie Ala Ser Gly Val Thr Glu Thr Thr Tyr Arg Asp Pro Asp Leu 
1620 1625 1630 

Ala Thr Gly Phe Tyr Thr Tyr Gly Val Lys Val Val Tyr Pro Asn Gly 
1635 1640 1645 

Glu Ser Ala lie Glu Thr Ala Thr Leu Asn lie Thr Ser Leu Ala Asp 
1650 1655 1660 

Val Thr Ala Gin Lys Pro Tyr Thr Leu Thr Val Val Gly Lys Thr lie 
1665 1670 1675 1680 

Thr Val Thr Cys Gin Gly Glu Ala Met lie Tyr Asp Met Asn Gly Arg 
1685 1690 1695 

Arg Leu Ala Ala Gly Arg Asn Thr Val Val Tyr Thr Ala Gin Gly Gly 
1700 1705 1710 

His Tyr Ala Val Met Val Val Val Asp Gly Lys Ser Tyr Val Glu Lys 
1715 1720 1725 

Leu Ala Val Lys 
1730 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
GGAATGGGAG ATGGAACT 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
GTAACCCGTA TTGTCTCC 18



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8588 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 365..8248

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

GATATCCGGC TCTTCGGCAG AGAATGCGAG AGATTCAGGA TATATCGCAA CGGCCTTGTC 60

AAGATCGAGG CCTCTTTAGG TCATGGATAT AACGTGAGTT CGATGTAAGC TTTTCGGCCT 120

TTCCATCATA CAATCGATTC GATTCTCTTT GGACTCAATA AAAAATATAA AATACTCAAA 180

GAGTTGGCAT ATAACTTTGC CTCAGTGGCG AGTGGGTTTT TCGGCCAATT CCTAAAGAAG 240

AAAATAGCTG TTTGTATCTT TTTGCGAAAA AAGTTTGGCG GATTAAGATT AAAAACATAT 300

CTTTCGGGCG ATAGTGGTAG AGCACTATCT TGCGAAACAT TAATCTTTAA TACTTTCAAA 360

AGGT ATG AGA AAA TTG AAT TCT TTA TTT TCG CTC GCC GTC CTA TTA TCC 409
Met Arg Lys Leu Asn Ser Leu Phe Ser Leu Ala Val Leu Leu Ser
1 5 10 15

CTA TTG TGT TGG GGA CAG ACG GCT GCC GCA CAG GGA GGG CCG AAG ACT 457
Leu Leu Cys Trp Gly Gln Thr Ala Ala Ala Gln Gly Gly Pro Lys Thr
20 25 30 

GCT CCT TCT GTG ACG CAC CAA GCG GTG CAG AAA GGT ATT CGA ACA TCC 505
Ala Pro Ser Val Thr His Gln Ala Val Gln Lys Gly Ile Arg Thr Ser
35 40 45 

AAG GTT AAG GAT CTC CGA GAT CCG ATT CCT GCC GGT ATG GCA CGA ATT 553 
Lys Val Lys Asp Leu Arg Asp Pro Ile Pro Ala Gly Met Ala Arg Ile
50 55 60

ATC TTG GAG GCT CAC GAT GTA TGG GAA GAC GGC ACA GGC TAT CAA ATG 601 
Ile Leu Glu Ala His Asp Val Trp Glu Asp Gly Thr Gly Tyr Gln Met
65 70 75 
CTT TGG GAT GCA GAT CAC AAT CAG TAC GGC GCA TCC ATT CCC GAA GAA 649
Leu Trp Asp Ala Asp His Asn Gln Tyr Gly Ala Ser Ile Pro Glu Glu
80 85 90 95 

TCT TTT TGG TTT GCC AAC GGA ACG ATC CCG GCC GGT CTT TAC GAT CCT 697 
Ser Phe Trp Phe Ala Asn Gly Thr He Pro Ala Gly Leu Tyr Asp Pro 
100 105 110 

TTC GAG TAT AAA GTT CCG GTC AAT GCC GAT GCA TCT TTT TCT CCC ACG 745 
Phe Glu Tyr Lys Val Pro Val Asn Ala Asp Ala Ser Phe Ser Pro Thr 
115 120 125 

AAT TTC GTG CTT GAT GGA ACA GCA TCA GCC GAT ATT CCT GCC GGC ACT 793 
Asn Phe Val Leu Asp Gly Thr Ala Ser Ala Asp He Pro Ala Gly Thr 
130 135 140 

TAT GAC TAT GTA ATC ATT AAC CCC AAT CCT GGC ATA ATA TAT ATA GTA 841 
Tyr Asp Tyr Val He He Asn Pro Asn Pro Gly He He Tyr He Val 
145 150 155 

GGA GAG GGT GTC TCC AAA GGT AAC GAT TAT GTG GTA GAG GCC GGT AAG 889 
Gly Glu Gly Val Ser Lys Gly Asn Asp Tyr Val Val Glu Ala Gly Lys 
160 165 170 175 

ACT TAT CAT TTC ACT GTC CAA CGA CAA GGC CCC GGC GAT GCT GCG TCC 937 
Thr Tyr His Phe Thr Val Gin Arg Gin Gly Pro Gly Asp Ala Ala Ser 
180 185 190 

GTT GTA GTG ACC GGA GAA GGT GGC AAT GAA TTC GCT CCC GTA CAG AAT 985 
Val Val Val Thr Gly Glu Gly Gly Asn Glu Phe Ala Pro Val Gin Asn 
195 200 205 

CTC CAA TGG TCT GTA TCT GGG CAG ACA GTG ACC CTC ACT TGG CAA GCC 1033 
Leu Gin Trp Ser Val Ser Gly Gin Thr Val Thr Leu Thr Trp Gin Ala 
210 215 220 

CCC GCA TCC GAC AAA CGG ACT TAT GTG TTG AAC GAA AGC TTC GAT ACG 1081 
Pro Ala Ser Asp Lys Arg Thr Tyr Val Leu Asn Glu Ser Phe Asp Thr 
225 230 235 

CAA ACG CTT CCT AAC GGC TGG ACA ATG ATC GAT GCT GAT GGT GAT GGT 1129 
Gin Thr Leu Pro Asn Gly Trp Thr Met He Asp Ala Asp Gly Asp Gly 
240 245 250 255 

CAC AAT TGG CTA TCT ACA ATA AAC GTT TAC AAC ACT GCT ACT CAT ACA 1177 
His Asn Trp Leu Ser Thr He Asn Val Tyr Asn Thr Ala Thr His Thr 
260 265 270 

GGT GAC GGT GCT ATG TTT AGC AAA TCA TGG ACT GCT AGC GGT GGT GCA 1225 
Gly Asp Gly Ala Met Phe Ser Lys Ser Trp Thr Ala Ser Gly Gly Ala 
275 280 285 

AAA ATT GAT TTG AGT CCT GAC AAC TAT TTG GTA ACT CCA AAG GTT ACG 1273 
Lys He Asp Leu Ser Pro Asp Asn Tyr Leu Val Thr Pro Lys Val Thr 
290 295 300 

GTT CCT GAG AAT GGT AAA CTT TCT TAT TGG GTT TCA TCT CAA GTG CCT 1321 
Val Pro Glu Asn Gly Lys Leu Ser Tyr Trp Val Ser Ser Gin Val Pro 
305 310 315 

TGG ACT AAT GAG CAT TAT GGA GTG TTC TTG TCC ACA ACC GGA AAC GAG 1369 
Trp Thr Asn Glu His Tyr Gly Val Phe Leu Ser Thr Thr Gly Asn Glu 
320 325 330 335 
GCT GCA AAC TTT ACG ATA AAG CTA CTG GAA GAA ACC CTC GGA TCC GAC 1417
Ala Ala Asn Phe Thr Ile Lys Leu Leu Glu Glu Thr Leu Gly Ser Asp
340 345 350 

AAA CCT GCT CCG ATG AAC TTG GTG AAG AGT GAA GGA GTA AAG CTT CCT 1465 
Lys Pro Ala Pro Met Asn Leu Val Lys Ser Glu Gly Val Lys Leu Pro 
355 360 365 

GCA CCT TAT CAG GAA AGA ACC ATC GAT CTC TCT GCC TAT GCC GGA CAA 1513 
Ala Pro Tyr Gin Glu Arg Thr He Asp Leu Ser Ala Tyr Ala Gly Gin 
370 375 380 

CAG GTG TAC TTG GCA TTC CGT CAT TTC AAC TCT ACA GGT ATA TTC CGT 1561 
Gin Val Tyr Leu Ala Phe Arg His Phe Asn Ser Thr Gly He Phe Arg 
385 390 395 

CTT TAT CTT GAT GAT GTG GCT GTT TCT GGT GAA GGT TCT TCC AAC GAC 1609 
Leu Tyr Leu Asp Asp Val Ala Val Ser Gly Glu Gly Ser Ser Asn Asp 
400 405 410 415 

TAC ACG TAC ACG GTA TAT CGT GAC AAT GTT GTT ATT GCC CAG AAT CTC 1657 
Tyr Thr Tyr Thr Val Tyr Arg Asp Asn Val Val He Ala Gin Asn Leu 
420 425 430 

GCG GCA ACG ACA TTC AAT CAG GAA AAT GTA GCT CCC GGC CAG TAT AAC 1705 
Ala Ala Thr Thr Phe Asn Gin Glu Asn Val Ala Pro Gly Gin Tyr Asn 
435 440 445 

TAC TGT GTT GAA GTT AAG TAC ACA GCC GGC GTA TCT CCG AAG GTA TGT 1753 
Tyr Cys Val Glu Val Lys Tyr Thr Ala Gly Val Ser Pro Lys Val Cys 
450 455 460 

AAA GAC GTT ACG GTA GAA GGA TCC AAC GAA TTT GCT CAT GTA CAG AAC 1801 
Lys Asp Val Thr Val Glu Gly Ser Asn Glu Phe Ala His Val Gin Asn 
465 470 475 

CTG ACC GGT AGT GCA GTA GGT CAG AAA GTA ACG CTT AAG TGG GAT GCA 1849 
Leu Thr Gly Ser Ala Val Gly Gin Lys Val Thr Leu Lys Trp Asp Ala 
480 485 490 495

CCT AAT GGT ACC CCG AAT CCG AAT CCC GGA ACA ACA ACA CTT TCC GAA 1897
Pro Asn Gly Thr Pro Asn Pro Asn Pro Gly Thr Thr Thr Leu Ser Glu
500 505 510

TCA TTC GAA AAT GGT ATT CCT GCC TCA TGG AAG ACG ATC GAT GCA GAC 1945 
Ser Phe Glu Asn Gly He Pro Ala Ser Trp Lys Thr He Asp Ala Asp 
515 520 525 

GGT GAC GGC AAC AAT TGG ACG ACG ACC CCT CCT CCC GGA GGC ACC TCT 1993 
Gly Asp Gly Asn Asn Trp Thr Thr Thr Pro Pro Pro Gly Gly Thr Ser 
530 535 540 

TTT GCA GGT CAC AAC AGT GCA ATC TGT GCC TCT TCG GCT TCT TAT ATC 2041 
Phe Ala Gly His Asn Ser Ala Ile Cys Ala Ser Ser Ala Ser Tyr Ile
545 550 555 

AAC TTT GAA GGT CCT CAG AAC CCT GAT AAC TAT CTG GTT ACA CCG GAG 2089 
Asn Phe Glu Gly Pro Gin Asn Pro Asp Asn Tyr Leu Val Thr Pro Glu 
560 565 570 575 

CTA TCT CTT CCT AAC GGA GGA ACG CTT ACT TTC TGG GTA TGT GCA CAA 2137 
Leu Ser Leu Pro Asn Gly Gly Thr Leu Thr Phe Trp Val Cys Ala Gln
580 585 590
GAT GCC AAT TAT GCA TCA GAG CAC TAT GCC GTG TAC GCA TCT TCT ACG 2185
Asp Ala Asn Tyr Ala Ser Glu His Tyr Ala Val Tyr Ala Ser Ser Thr
595 600 605 

GGT AAC GAC GCT TCC AAC TTC GCC AAC GCT TTG TTG GAA GAA GTG CTG 2233 
Gly Asn Asp Ala Ser Asn Phe Ala Asn Ala Leu Leu Glu Glu Val Leu 
610 615 620 

ACG GCC AAG ACA GTT GTT ACG GCA CCT GAA GCC ATT CGT GGC ACT CGT 2281 
Thr Ala Lys Thr Val Val Thr Ala Pro Glu Ala lie Arg Gly Thr Arg 
625 630 635 

GTT CAG GGC ACC TGG TAT CAA AAG ACG GTA CAG TTG CCT GCG GGT ACT 2329 
Val Gin Gly Thr Trp Tyr Gin Lys Thr Val Gin Leu Pro Ala Gly Thr 
640 645 650 655 

AAG TAT GTT GCT TTC CGT CAC TTC GGC TGT ACG GAC TTC TTC TGG ATT 2377 
Lys Tyr Val Ala Phe Arg His Phe Gly Cys Thr Asp Phe Phe Trp lie 
660 665 670 

AAC CTT GAT GAT GTT GAG ATC AAG GCC AAC GGC AAG CGC GCA GAC TTC 2425 
Asn Leu Asp Asp Val Glu Ile Lys Ala Asn Gly Lys Arg Ala Asp Phe
675 680 685

ACG GAA ACG TTC GAG TCT TCT ACT CAT GGA GAG GCA CCG GCG GAA TGG 2473 
Thr Glu Thr Phe Glu Ser Ser Thr His Gly Glu Ala Pro Ala Glu Trp 
690 695 700

ACT ACT ATC GAT GCC GAT GGC GAT GGT CAG GGT TGG CTC TGT CTG TCT 2521 
Thr Thr lie Asp Ala Asp Gly Asp Gly Gin Gly Trp Leu Cys Leu Ser 
705 710 715 

TCC GGA CAA TTG GAC TGG CTG ACA GCT CAT GGC GGC ACC AAC GTA GTA 2569 
Ser Gly Gin Leu Asp Trp Leu Thr Ala His Gly Gly Thr Asn Val Val 
720 725 730 735 

GCC TCT TTC TCA TGG AAT GGA ATG GCT TTG AAT CCT GAT AAC TAT CTC 2617 
Ala Ser Phe Ser Trp Asn Gly Met Ala Leu Asn Pro Asp Asn Tyr Leu 
740 745 750 

ATC TCA AAG GAT GTT ACA GGC GCA ACT AAG GTA AAG TAC TAC TAT GCA 2665 
lie Ser Lys Asp Val Thr Gly Ala Thr Lys Val Lys Tyr Tyr Tyr Ala 
755 760 765 

GTC AAC GAC GGT TTT CCC GGG GAT CAC TAT GCG GTG ATG ATC TCC AAG 2713 
Val Asn Asp Gly Phe Pro Gly Asp His Tyr Ala Val Met lie Ser Lys 
770 775 780 

ACG GGC ACG AAC GCC GGA GAC TTC ACG GTT GTT TTC GAA GAA ACG CCT 2761 
Thr Gly Thr Asn Ala Gly Asp Phe Thr Val Val Phe Glu Glu Thr Pro 
785 790 795 

AAC GGA ATA AAT AAG GGC GGA GCA AGA TTC GGT CTT TCC ACG GAA GCC 2809
Asn Gly Ile Asn Lys Gly Gly Ala Arg Phe Gly Leu Ser Thr Glu Ala
800 805 810 815 

GAT GGC GCC AAA CCT CAA AGT GTA TGG ATC GAG CGT ACG GTA GAT TTG 2857 
Asp Gly Ala Lys Pro Gin Ser Val Trp lie Glu Arg Thr Val Asp Leu 
820 825 830 

CCT GCG GGT ACT AAG TAT GTT GCT TTC CGT CAC TAC AAT TGC TCG GAT 2905 
Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg His Tyr Asn Cys Ser Asp 
835 840 845 
TTG AAC TAC ATT CTT TTG GAT GAT ATT CAG TTC ACC ATG GGT GGC AGC 2953
Leu Asn Tyr Ile Leu Leu Asp Asp Ile Gln Phe Thr Met Gly Gly Ser
850 855 860 

CCC ACC CCG ACC GAT TAT ACC TAC ACG GTG TAT CGT GAC GGT ACG AAG 3001 
Pro Thr Pro Thr Asp Tyr Thr Tyr Thr Val Tyr Arg Asp Gly Thr Lys 
865 870 875 

ATC AAG GAA GGT CTG ACC GAA ACG ACC TTC GAA GAA GAC GGT GTA GCT 3049 
He Lys Glu Gly Leu Thr Glu Thr Thr Phe Glu Glu Asp Gly Val Ala 
880 885 890 895 

ACG GGC AAC CAT GAG TAT TGC GTG GAA GTG AAG TAC ACA GCC GGC GTA 3097 
Thr Gly Asn His Glu Tyr Cys Val Glu Val Lys Tyr Thr Ala Gly Val 
900 905 910 

TCT CCG AAA GAG TGT GTA AAC GTA ACT GTT GAT CCT GTG CAG TTC AAT 3145
Ser Pro Lys Glu Cys Val Asn Val Thr Val Asp Pro Val Gln Phe Asn
915 920 925

CCT GTA CAG AAC CTG ACC GGT AGT GCA GTC GGC CAG AAA GTA ACG CTT 3193
Pro Val Gln Asn Leu Thr Gly Ser Ala Val Gly Gln Lys Val Thr Leu
930 935 940

AAG TGG GAT GCA CCT AAT GGT ACC CCG AAT CCA AAT CCA AAT CCG AAT 3241 
Lys Trp Asp Ala Pro Asn Gly Thr Pro Asn Pro Asn Pro Asn Pro Asn 
945 950 955 

CCG GGA ACA ACA ACA CTT TCC GAA TCA TTC GAA AAT GGT ATT CCT GCC 3289
Pro Gly Thr Thr Thr Leu Ser Glu Ser Phe Glu Asn Gly Ile Pro Ala
960 965 970 975

TCA TGG AAG ACG ATC GAT GCA GAC GGT GAC GGC AAC AAT TGG ACG ACG 3337
Ser Trp Lys Thr Ile Asp Ala Asp Gly Asp Gly Asn Asn Trp Thr Thr
980 985 990

ACC CCT CCT CCC GGA GGC ACC TCT TTT GCA GGT CAC AAC AGT GCG ATC 3385
Thr Pro Pro Pro Gly Gly Thr Ser Phe Ala Gly His Asn Ser Ala Ile
995 1000 1005

TGT GCC TCT TCG GCT TCT TAT ATC AAC TTT GAA GGC CCT CAG AAC CCT 3433
Cys Ala Ser Ser Ala Ser Tyr Ile Asn Phe Glu Gly Pro Gln Asn Pro
1010 1015 1020

GAT AAC TAT CTG GTT ACA CCG GAG CTA TCT CTT CCT AAC GGA GGA ACG 3481
Asp Asn Tyr Leu Val Thr Pro Glu Leu Ser Leu Pro Asn Gly Gly Thr
1025 1030 1035

CTT ACT TTC TGG GTA TGT GCA CAA GAT GCC AAT TAT GCA TCA GAG CAC 3529 
Leu Thr Phe Trp Val Cys Ala Gln Asp Ala Asn Tyr Ala Ser Glu His
1040 1045 1050 1055

TAT GCC GTG TAT GCA TCT TCT ACG GGT AAC GAC GCT TCC AAC TTC GCC 3577
Tyr Ala Val Tyr Ala Ser Ser Thr Gly Asn Asp Ala Ser Asn Phe Ala
1060 1065 1070

AAC GCT TTG TTG GAA GAA GTG CTG ACG GCC AAG ACA GTT GTT ACG GCA 3625
Asn Ala Leu Leu Glu Glu Val Leu Thr Ala Lys Thr Val Val Thr Ala
1075 1080 1085

CCT GAA GCC ATT CGT GGC ACT CGT GTT CAG GGC ACC TGG TAT CAA AAG 3673
Pro Glu Ala Ile Arg Gly Thr Arg Val Gln Gly Thr Trp Tyr Gln Lys
1090 1095 1100
ACG GTA CAG TTG CCT GCG GGT ACT AAG TAT GTT GCT TTC CGT CAC TTC 3721
Thr Val Gln Leu Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg His Phe
1105 1110 1115 

GGC TGT ACG GAC TTC TTC TGG ATC AAC CTT GAT GAT GTT GAG ATC AAG 3769
Gly Cys Thr Asp Phe Phe Trp Ile Asn Leu Asp Asp Val Glu Ile Lys
1120 1125 1130 1135 

GCC AAC GGC AAG CGC GCA GAC TTC ACG GAA ACG TTC GAG TCT TCT ACT 3817 
Ala Asn Gly Lys Arg Ala Asp Phe Thr Glu Thr Phe Glu Ser Ser Thr 
1140 1145 1150

CAT GGA GAG GCA CCG GCG GAA TGG ACT ACT ATC GAT GCC GAT GGC GAT 3865
His Gly Glu Ala Pro Ala Glu Trp Thr Thr Ile Asp Ala Asp Gly Asp
1155 1160 1165 

GGT CAG GGT TGG CTC TGT CTG TCT TCC GGA CAA TTG GGC TGG CTG ACA 3913
Gly Gln Gly Trp Leu Cys Leu Ser Ser Gly Gln Leu Gly Trp Leu Thr
1170 1175 1180 

GCT CAT GGC GGC ACC AAC GTA GTA GCC TCT TTC TCA TGG AAT GGA ATG 3961 
Ala His Gly Gly Thr Asn Val Val Ala Ser Phe Ser Trp Asn Gly Met 
1185 1190 1195 

GCT TTG AAT CCT GAT AAC TAT CTC ATC TCA AAG GAT GTT ACA GGC GCA 4009 
Ala Leu Asn Pro Asp Asn Tyr Leu He Ser Lys Asp Val Thr Gly Ala 
1200 1205 1210 1215 

ACT AAG GTA AAG TAC TAC TAT GCA GTC AAC GAC GGT TTT CCC GGG GAT 4057 
Thr Lys Val Lys Tyr Tyr Tyr Ala Val Asn Asp Gly Phe Pro Gly Asp 
1220 1225 1230 

CAC TAT GCG GTG ATG ATC TCC AAG ACG GGC ACG AAC GCC GGA GAC TTC 4105 
His Tyr Ala Val Met He Ser Lys Thr Gly Thr Asn Ala Gly Asp Phe 
1235 1240 1245 

ACG GTT GTT TTC GAA GAA ACG CCT AAC GGA ATA AAT AAG GGC GGA GCA 4153 
Thr Val Val Phe Glu Glu Thr Pro Asn Gly He Asn Lys Gly Gly Ala 
1250 1255 1260 

AGA TTC GGT CTT TCC ACG GAA GCC GAT GGC GCC AAA CCT CAA AGT GTA 4201
Arg Phe Gly Leu Ser Thr Glu Ala Asp Gly Ala Lys Pro Gln Ser Val
1265 1270 1275

TGG ATC GAG CGT ACG GTA GAT TTG CCT GCG GGT ACT AAG TAT GTT GCT 4249 
Trp lie Glu Arg Thr Val Asp Leu Pro Ala Gly Thr Lys Tyr Val Ala 
1280 1285 1290 1295 

TTC CGT CAC TAC AAT TGC TCG GAT TTG AAC TAC ATT CTT TTG GAT GAT 4297 
Phe Arg His Tyr Asn Cys Ser Asp Leu Asn Tyr He Leu Leu Asp Asp 
1300 1305 1310 

ATT CAG TTC ACC ATG GGT GGC AGC CCC ACC CCG ACC GAT TAT ACC TAC 4345 
He Gin Phe Thr Met Gly Gly Ser Pro Thr Pro Thr Asp Tyr Thr Tyr 
1315 1320 1325 

ACG GTG TAT CGT GAC GGT ACG AAG ATC AAG GAA GGT CTG ACC GAA ACG 4393 
Thr Val Tyr Arg Asp Gly Thr Lys He Lys Glu Gly Leu Thr Glu Thr 
1330 1335 1340 

ACC TTC GAA GAA GAC GGT GTA GCT ACG GGC AAC CAT GAG TAT TGC GTG 4441
Thr Phe Glu Glu Asp Gly Val Ala Thr Gly Asn His Glu Tyr Cys Val 
1345 1350 1355 
GAA GTG AAG TAC ACA GCC GGC GTA TCT CCG AAA GAG TGT GTA AAC GTA 4489
Glu Val Lys Tyr Thr Ala Gly Val Ser Pro Lys Glu Cys Val Asn Val
1360 1365 1370 1375 

ACT GTT GAT CCT GTG CAG TTC AAT CCT GTA CAG AAC CTG ACC GGT AGT 4537 
Thr Val Asp Pro Val Gin Phe Asn Pro Val Gin Asn Leu Thr Gly Ser 
1380 1385 1390 

GCA GTC GGC CAG AAA GTA ACG CTT AAG TGG GAT GCA CCT AAT GGT ACC 4585 
Ala Val Gly Gin Lys Val Thr Leu Lys Trp Asp Ala Pro Asn Gly Thr 
1395 1400 1405 

CCG AAT CCA AAT CCA AAT CCG AAT CCG GGA ACA ACA ACA CTT TCC GAA 4633 
Pro Asn Pro Asn Pro Asn Pro Asn Pro Gly Thr Thr Thr Leu Ser Glu 
1410 1415 1420 

TCA TTC GAA AAT GGT ATT CCT GCC TCA TGG AAG ACG ATC GAT GCA GAC 4681 
Ser Phe Glu Asn Gly He Pro Ala Ser Trp Lys Thr He Asp Ala Asp 
1425 1430 1435 

GGT GAC GGC AAC AAT TGG ACG ACG ACC CCT CCT CCC GGA GGC ACC TCT 4729 
Gly Asp Gly Asn Asn Trp Thr Thr Thr Pro Pro Pro Gly Gly Thr Ser 
1440 1445 1450 1455 

TTT GCA GGT CAC AAC AGT GCG ATC TGT GCC TCT TCG GCT TCT TAT ATC 4777 
Phe Ala Gly His Asn Ser Ala He Cys Ala Ser Ser Ala Ser Tyr He 
1460 1465 1470 

AAC TTT GAA GGC CCT CAG AAC CCT GAT AAC TAT CTG GTT ACA CCG GAG 4825 
Asn Phe Glu Gly Pro Gin Asn Pro Asp Asn Tyr Leu Val Thr Pro Glu 
1475 1480 1485 

CTA TCT CTT CCT AAC GGA GGA ACG CTT ACT TTC TGG GTA TGT GCA CAA 4873 
Leu Ser Leu Pro Asn Gly Gly Thr Leu Thr Phe Trp Val Cys Ala Gin 
1490 1495 1500 

GAT GCC AAT TAT GCA TCA GAG CAC TAT GCC GTG TAT GCA TCT TCT ACG 4921 
Asp Ala Asn Tyr Ala Ser Glu His Tyr Ala Val Tyr Ala Ser Ser Thr 
1505 1510 1515 

GGT AAC GAC GCT TCC AAC TTC GCC AAC GCT TTG TTG GAA GAA GTG CTG 4969
Gly Asn Asp Ala Ser Asn Phe Ala Asn Ala Leu Leu Glu Glu Val Leu
1520 1525 1530 1535 

ACG GCC AAG ACA GTT GTT ACG GCA CCT GAA GCC ATT CGT GGC ACT CGT 5017 
Thr Ala Lys Thr Val Val Thr Ala Pro Glu Ala He Arg Gly Thr Arg 
1540 1545 1550 

GTT CAG GGC ACC TGG TAT CAA AAG ACG GTA CAG TTG CCT GCG GGT ACT 5065
Val Gln Gly Thr Trp Tyr Gln Lys Thr Val Gln Leu Pro Ala Gly Thr
1555 1560 1565 

AAG TAT GTT GCT TTC CGT CAC TTC GGC TGT ACG GAC TTC TTC TGG ATC 5113
Lys Tyr Val Ala Phe Arg His Phe Gly Cys Thr Asp Phe Phe Trp Ile
1570 1575 1580 

AAC CTT GAT GAT GTT GAG ATC AAG GCC AAC GGC AAG CGC GCA GAC TTC 5161 
Asn Leu Asp Asp Val Glu lie Lys Ala Asn Gly Lys Arg Ala Asp Phe 
1585 1590 1595 

ACG GAA ACG TTC GAG TCT TCT ACT CAT GGA GAG GCA CCG GCG GAA TGG 5209 
Thr Glu Thr Phe Glu Ser Ser Thr His Gly Glu Ala Pro Ala Glu Trp 
1600 1605 1610 1615 
ACT ACT ATC GAT GCC GAT GGC GAT GGT CAG GGT TGG CTC TGT CTG TCT 5257
Thr Thr Ile Asp Ala Asp Gly Asp Gly Gln Gly Trp Leu Cys Leu Ser
1620 1625 1630

TCC GGA CAA TTG GGC TGG CTG ACA GCT CAT GGC GGC ACC AAC GTA GTA 5305 
Ser Gly Gin Leu Gly Trp Leu Thr Ala His Gly Gly Thr Asn Val Val 
1635 1640 1645 

GCC TCT TTC TCA TGG AAT GGA ATG GCT TTG AAT CCT GAT AAC TAT CTC 5353 
Ala Ser Phe Ser Trp Asn Gly Met Ala Leu Asn Pro Asp Asn Tyr Leu 
1650 1655 1660 

ATC TCA AAG GAT GTT ACA GGC GCA ACT AAG GTA AAG TAC TAC TAT GCA 5401 
lie Ser Lys Asp Val Thr Gly Ala Thr Lys Val Lys Tyr Tyr Tyr Ala 
1665 1670 1675 

GTC AAC GAC GGT TTT CCC GGG GAT CAC TAT GCG GTG ATG ATC TCC AAG 5449
Val Asn Asp Gly Phe Pro Gly Asp His Tyr Ala Val Met Ile Ser Lys
1680 1685 1690 1695

ACG GGC ACG AAC GCC GGA GAC TTC ACG GTT GTT TTC GAA GAA ACG CCT 5497 
Thr Gly Thr Asn Ala Gly Asp Phe Thr Val Val Phe Glu Glu Thr Pro 
1700 1705 1710 

AAC GGA ATA AAT AAG GGC GGA GCA AGA TTC GGT CTT TCC ACG GAA GCC 5545 
Asn Gly lie Asn Lys Gly Gly Ala Arg Phe Gly Leu Ser Thr Glu Ala 
1715 1720 1725

GAT GGC GCC AAA CCT CAA AGT GTA TGG ATC GAG CGT ACG GTA GAT TTG 5593 
Asp Gly Ala Lys Pro Gin Ser Val Trp lie Glu Arg Thr Val Asp Leu 
1730 1735 1740 

CCT GCG GGT ACT AAG TAT GTT GCT TTC CGA CAC TAC AAT TGC TCG GAT 5641 
Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg His Tyr Asn Cys Ser Asp 
1745 1750 1755 

TTG AAC TAC ATT CTT TTG GAT GAT ATT CAG TTC ACC ATG GGT GGC AGC 5689 
Leu Asn Tyr lie Leu Leu Asp Asp lie Gin Phe Thr Met Gly Gly Ser 
1760 1765 1770 1775

CCC ACC CCG ACC GAT TAT ACC TAC ACG GTG TAT CGT GAC GGT ACG AAG 5737 
Pro Thr Pro Thr Asp Tyr Thr Tyr Thr Val Tyr Arg Asp Gly Thr Lys 
1780 1785 1790 

ATC AAG GAA GGT CTG ACC GAA ACG ACC TTC GAA GAA GAC GGT GTA GCT 5785 
Ile Lys Glu Gly Leu Thr Glu Thr Thr Phe Glu Glu Asp Gly Val Ala
1795 1800 1805

ACG GGC AAC CAT GAG TAT TGC GTG GAA GTG AAG TAC ACA GCC GGC GTA 5833
Thr Gly Asn His Glu Tyr Cys Val Glu Val Lys Tyr Thr Ala Gly Val
1810 1815 1820

TCT CCG AAA GAG TGT GTA AAC GTA ACT GTT GAT CCT GTG CAG TTC AAT 5881
Ser Pro Lys Glu Cys Val Asn Val Thr Val Asp Pro Val Gln Phe Asn
1825 1830 1835

CCT GTA CAG AAC CTG ACC GGT AGT GCA GTC GGC CAG AAA GTA ACG CTT 5929
Pro Val Gln Asn Leu Thr Gly Ser Ala Val Gly Gln Lys Val Thr Leu
1840 1845 1850 1855

AAG TGG GAT GCA CCT AAT GGT ACC CCG AAT CCA AAT CCA AAT CCG AAT 5977 
Lys Trp Asp Ala Pro Asn Gly Thr Pro Asn Pro Asn Pro Asn Pro Asn 
1860 1865 1870

CCG GGA ACA ACA ACA CTT TCC GAA TCA TTC GAA AAT GGT ATT CCT GCC 6025
Pro Gly Thr Thr Thr Leu Ser Glu Ser Phe Glu Asn Gly Ile Pro Ala
1875 1880 1885

TCA TGG AAG ACG ATC GAT GCA GAC GGT GAC GGC AAC AAT TGG ACG ACG 6073
Ser Trp Lys Thr Ile Asp Ala Asp Gly Asp Gly Asn Asn Trp Thr Thr
1890 1895 1900

ACC CCT CCT CCC GGA GGC ACC TCT TTT GCA GGT CAC AAC AGT GCG ATC 6121
Thr Pro Pro Pro Gly Gly Thr Ser Phe Ala Gly His Asn Ser Ala Ile
1905 1910 1915

TGT GTC TCT TCG GCT TCT TAT ATC AAC TTT GAA GGC CCT CAG AAC CCT 6169
Cys Val Ser Ser Ala Ser Tyr Ile Asn Phe Glu Gly Pro Gln Asn Pro
1920 1925 1930 1935

GAT AAC TAT CTG GTT ACA CCG GAG CTA TCT CTT CCT GGC GGA GGA ACG 6217
Asp Asn Tyr Leu Val Thr Pro Glu Leu Ser Leu Pro Gly Gly Gly Thr
1940 1945 1950

CTT ACT TTC TGG GTA TGT GCA CAA GAT GCC AAT TAT GCA TCA GAG CAC 6265
Leu Thr Phe Trp Val Cys Ala Gln Asp Ala Asn Tyr Ala Ser Glu His
1955 1960 1965

TAT GCC GTG TAT GCA TCT TCT ACG GGT AAC GAC GCT TCC AAC TTC GCC 6313 
Tyr Ala Val Tyr Ala Ser Ser Thr Gly Asn Asp Ala Ser Asn Phe Ala 
1970 1975 1980 

AAC GCT TTG TTG GAA GAA GTG CTG ACG GCC AAG ACA GTT GTT ACG GCA 6361 
Asn Ala Leu Leu Glu Glu Val Leu Thr Ala Lys Thr Val Val Thr Ala 
1985 1990 1995 

CCT GAA GCC ATT CGT GGC ACT CGT GTT CAG GGC ACC TGG TAT CAA AAG 6409 
Pro Glu Ala Ile Arg Gly Thr Arg Val Gln Gly Thr Trp Tyr Gln Lys
2000 2005 2010 2015

ACG GTA CAG TTG CCT GCG GGT ACT AAG TAT GTT GCC TTC CGT CAC TTC 6457
Thr Val Gln Leu Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg His Phe
2020 2025 2030

GGC TGT ACG GAC TTC TTC TGG ATC AAC CTT GAT GAA GTT GAG ATC AAG 6505
Gly Cys Thr Asp Phe Phe Trp Ile Asn Leu Asp Glu Val Glu Ile Lys
2035 2040 2045 

GCC AAC GGC AAG CGC GCA GAC TTC ACG GAA ACG TTC GAG TCT TCT ACT 6553 
Ala Asn Gly Lys Arg Ala Asp Phe Thr Glu Thr Phe Glu Ser Ser Thr 
2050 2055 2060 

CAT GGA GAG GCA CCG GCG GAA TGG ACT ACT ATC GAT GCC GAT GGC GAT 6601 
His Gly Glu Ala Pro Ala Glu Trp Thr Thr Ile Asp Ala Asp Gly Asp
2065 2070 2075

GGT CAG GGT TGG CTC TGT CTG TCT TCC GGA CAA TTG GAC TGG CTG ACA 6649
Gly Gln Gly Trp Leu Cys Leu Ser Ser Gly Gln Leu Asp Trp Leu Thr
2080 2085 2090 2095 

GCT CAT GGC GGC ACC AAC GTA GTA GCC TCT TTC TCA TGG AAT GGA ATG 6697 
Ala His Gly Gly Thr Asn Val Val Ala Ser Phe Ser Trp Asn Gly Met 
2100 2105 2110 

GCT TTG AAT CCT GAT AAC TAT CTC ATC TCA AAG GAT GTT ACA GGC GCA 6745 
Ala Leu Asn Pro Asp Asn Tyr Leu Ile Ser Lys Asp Val Thr Gly Ala
2115 2120 2125

ACT AAG GTA AAG TAC TAC TAT GCA GTC AAC GAC GGT TTT CCC GGG GAT 6793
Thr Lys Val Lys Tyr Tyr Tyr Ala Val Asn Asp Gly Phe Pro Gly Asp
2130 2135 2140

CAC TAT GCG GTG ATG ATC TCC AAG ACG GGC ACG AAC GCC GGA GAC TTC 6841
His Tyr Ala Val Met Ile Ser Lys Thr Gly Thr Asn Ala Gly Asp Phe
2145 2150 2155

ACG GTT GTT TTC GAA GAA ACG CCT AAC GGA ATA AAT AAG GGC GGA GCA 6889
Thr Val Val Phe Glu Glu Thr Pro Asn Gly Ile Asn Lys Gly Gly Ala
2160 2165 2170 2175

AGA TTC GGT CTT TCC ACG GAA GCC GAT GGC GCC AAA CCT CAA AGT GTA 6937
Arg Phe Gly Leu Ser Thr Glu Ala Asp Gly Ala Lys Pro Gln Ser Val
2180 2185 2190

TGG ATC GAG CGT ACG GTA GAT TTG CCT GCG GGC ACG AAG TAT GTT GCT 6985
Trp Ile Glu Arg Thr Val Asp Leu Pro Ala Gly Thr Lys Tyr Val Ala
2195 2200 2205

TTC CGT CAC TAC AAT TGC TCG GAT TTG AAC TAC ATT CTT TTG GAT GAT 7033
Phe Arg His Tyr Asn Cys Ser Asp Leu Asn Tyr Ile Leu Leu Asp Asp
2210 2215 2220

ATT CAG TTC ACC ATG GGT GGC AGC CCC ACC CCG ACC GAT TAT ACC TAC 7081
Ile Gln Phe Thr Met Gly Gly Ser Pro Thr Pro Thr Asp Tyr Thr Tyr
2225 2230 2235

ACG GTG TAT CGT GAC GGT ACG AAG ATC AAG GAA GGT CTG ACC GAA ACG 7129
Thr Val Tyr Arg Asp Gly Thr Lys Ile Lys Glu Gly Leu Thr Glu Thr
2240 2245 2250 2255

ACC TTC GAA GAA GAT GGT GTA GCT ACG GGC AAT CAT GAG TAT TGC GTG 7177 
Thr Phe Glu Glu Asp Gly Val Ala Thr Gly Asn His Glu Tyr Cys Val 
2260 2265 2270 

GAA GTG AAG TAC ACA GCC GGC GTA TCT CCG AAG GTG TGT GTA AAC GTA 7225 
Glu Val Lys Tyr Thr Ala Gly Val Ser Pro Lys Val Cys Val Asn Val 
2275 2280 2285 

ACT ATT AAT CCG ACT CAG TTC AAT CCT GTA CAG AAC CTG ACG GCA GAA 7273 
Thr Ile Asn Pro Thr Gln Phe Asn Pro Val Gln Asn Leu Thr Ala Glu
2290 2295 2300

CAA GCT CCT AAC AGC ATG GAT GCA ATC CTT AAA TGG AAT GCA CCG GCA 7321
Gln Ala Pro Asn Ser Met Asp Ala Ile Leu Lys Trp Asn Ala Pro Ala
2305 2310 2315

TCT AAG CGT GCG GAA GTT CTG AAC GAA GAC TTC GAA AAT GGT ATT CCT 7369
Ser Lys Arg Ala Glu Val Leu Asn Glu Asp Phe Glu Asn Gly Ile Pro
2320 2325 2330 2335

TCC TCA TGG AAG ACG ATC GAT GCA GAC GGG GAC GGC AAC AAT TGG ACG 7417
Ser Ser Trp Lys Thr Ile Asp Ala Asp Gly Asp Gly Asn Asn Trp Thr
2340 2345 2350

ACG ACC CCT CCT CCC GGA GGC TCC TCT TTT GCA GGT CAC AAC AGT GCG 7465
Thr Thr Pro Pro Pro Gly Gly Ser Ser Phe Ala Gly His Asn Ser Ala
2355 2360 2365

ATC TGT GTC TCT TCG GCT TCT TAT ATC AAC TTT GAA GGT CCT CAG AAC 7513
Ile Cys Val Ser Ser Ala Ser Tyr Ile Asn Phe Glu Gly Pro Gln Asn
2370 2375 2380

CCT GAT AAC TAT CTG GTT ACA CCG GAG CTT TCT CTT CCT GGC GGA GGA 7561
Pro Asp Asn Tyr Leu Val Thr Pro Glu Leu Ser Leu Pro Gly Gly Gly
2385 2390 2395

ACG CTT ACT TTC TGG GTA TGT GCA CAA GAT GCC AAT TAT GCA TCA GAG 7609
Thr Leu Thr Phe Trp Val Cys Ala Gln Asp Ala Asn Tyr Ala Ser Glu
2400 2405 2410 2415

CAC TAT GCC GTG TAT GCA TCT TCT ACG GGT AAC GAC GCT TCC AAC TTC 7657 
His Tyr Ala Val Tyr Ala Ser Ser Thr Gly Asn Asp Ala Ser Asn Phe 
2420 2425 2430

GCC AAC GCT TTG TTG GAA GAA GTG CTG ACG GCC AAG ACA GTT GTT ACG 7705 
Ala Asn Ala Leu Leu Glu Glu Val Leu Thr Ala Lys Thr Val Val Thr 
2435 2440 2445 

GCG CCT GAA GCC ATT CGT GGC ACT CGT GTT CAG GGC ACC TGG TAT CAA 7753 
Ala Pro Glu Ala Ile Arg Gly Thr Arg Val Gln Gly Thr Trp Tyr Gln
2450 2455 2460

AAG ACG GTA CAG TTG CCT GCG GGT ACT AAG TAT GTT GCC TTC CGT CAC 7801
Lys Thr Val Gln Leu Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg His
2465 2470 2475

TTC GGC TGT ACG GAC TTC TTC TGG ATC AAC CTT GAT GAT GTT GTA ATC 7849
Phe Gly Cys Thr Asp Phe Phe Trp Ile Asn Leu Asp Asp Val Val Ile
2480 2485 2490 2495

ACT TCA GGG AAC GCT CCG TCT TAC ACC TAT ACG ATC TAT CGT AAT AAT 7897
Thr Ser Gly Asn Ala Pro Ser Tyr Thr Tyr Thr Ile Tyr Arg Asn Asn
2500 2505 2510

ACA CAG ATA GCA TCA GGC GTA ACG GAG ACT ACT TAC CGA GAT CCG GAC 7945
Thr Gln Ile Ala Ser Gly Val Thr Glu Thr Thr Tyr Arg Asp Pro Asp
2515 2520 2525

TTG GCT ACC GGT TTT TAC ACG TAC GGT GTT AAG GTT GTT TAC CCG AAC 7993
Leu Ala Thr Gly Phe Tyr Thr Tyr Gly Val Lys Val Val Tyr Pro Asn
2530 2535 2540 

GGA GAA TCA GCT ATC GAA ACT GCT ACG TTG AAT ATC ACT TCG TTG GCA 8041
Gly Glu Ser Ala Ile Glu Thr Ala Thr Leu Asn Ile Thr Ser Leu Ala
2545 2550 2555

GAC GTA ACG GCT CAG AAG CCT TAC ACG CTG ACA GTT GTA GGA AAG ACG 8089
Asp Val Thr Ala Gln Lys Pro Tyr Thr Leu Thr Val Val Gly Lys Thr
2560 2565 2570 2575

ATC ACG GTA ACT TGC CAA GGC GAA GCT ATG ATC TAC GAC ATG AAC GGT 8137
Ile Thr Val Thr Cys Gln Gly Glu Ala Met Ile Tyr Asp Met Asn Gly
2580 2585 2590

CGT CGT CTG GCA GCC GGT CGC AAC ACG GTT GTT TAC ACG GCT CAG GGC 8185
Arg Arg Leu Ala Ala Gly Arg Asn Thr Val Val Tyr Thr Ala Gln Gly
2595 2600 2605

GGC CAC TAT GCA GTC ATG GTT GTC GTT GAC GGC AAG TCC TAC GTA GAG 8233
Gly His Tyr Ala Val Met Val Val Val Asp Gly Lys Ser Tyr Val Glu
2610 2615 2620

AAA CTC GCT GTA AAG TAACGAGATG ATTATTTTCG ATCGGTATGC TCTACCAACC 8288
Lys Leu Ala Val Lys
2625

GATCGCTTTA ATCGGTCGCC CGGCTTCCAT AAAAAGGAGT CGGGCGACTC TTTTACTCCA 8348

ACCAAATAAG CATTGTTTTA TAGCCTTTCG GAATATACTC CGGAAGGGGG TCGAGCTACG 8408

CCCTACAGCG ACTCGGGCTA CGCCGTAGAG CGTACCGAGC TGCGCTCTAC GGCTCTTCGA 8468

GCTACGCTGT AGGGCTCACT GCGCCAAGCT CTACGGCTCA GCTCGGCCAC CTCTACGGCT 8528

CCCGGAGCGG AACTCTACGG CTCGGCTCGC TACGCTGTAG AGCGTACCTA CGCCGAGCTC 8588

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2628 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Met Arg Lys Leu Asn Ser Leu Phe Ser Leu Ala Val Leu Leu Ser Leu
1 5 10 15

Leu Cys Trp Gly Gln Thr Ala Ala Ala Gln Gly Gly Pro Lys Thr Ala
20 25 30

Pro Ser Val Thr His Gln Ala Val Gln Lys Gly Ile Arg Thr Ser Lys
35 40 45

Val Lys Asp Leu Arg Asp Pro Ile Pro Ala Gly Met Ala Arg Ile Ile
50 55 60

Leu Glu Ala His Asp Val Trp Glu Asp Gly Thr Gly Tyr Gln Met Leu
65 70 75 80

Trp Asp Ala Asp His Asn Gln Tyr Gly Ala Ser Ile Pro Glu Glu Ser
85 90 95

Phe Trp Phe Ala Asn Gly Thr Ile Pro Ala Gly Leu Tyr Asp Pro Phe
100 105 110

Glu Tyr Lys Val Pro Val Asn Ala Asp Ala Ser Phe Ser Pro Thr Asn
115 120 125

Phe Val Leu Asp Gly Thr Ala Ser Ala Asp Ile Pro Ala Gly Thr Tyr
130 135 140

Asp Tyr Val Ile Ile Asn Pro Asn Pro Gly Ile Ile Tyr Ile Val Gly
145 150 155 160

Glu Gly Val Ser Lys Gly Asn Asp Tyr Val Val Glu Ala Gly Lys Thr
165 170 175

Tyr His Phe Thr Val Gln Arg Gln Gly Pro Gly Asp Ala Ala Ser Val
180 185 190

Val Val Thr Gly Glu Gly Gly Asn Glu Phe Ala Pro Val Gln Asn Leu
195 200 205

Gln Trp Ser Val Ser Gly Gln Thr Val Thr Leu Thr Trp Gln Ala Pro
210 215 220

Ala Ser Asp Lys Arg Thr Tyr Val Leu Asn Glu Ser Phe Asp Thr Gln
225 230 235 240

Thr Leu Pro Asn Gly Trp Thr Met Ile Asp Ala Asp Gly Asp Gly His
245 250 255

Asn Trp Leu Ser Thr Ile Asn Val Tyr Asn Thr Ala Thr His Thr Gly
260 265 270

Asp Gly Ala Met Phe Ser Lys Ser Trp Thr Ala Ser Gly Gly Ala Lys
275 280 285

Ile Asp Leu Ser Pro Asp Asn Tyr Leu Val Thr Pro Lys Val Thr Val
290 295 300

Pro Glu Asn Gly Lys Leu Ser Tyr Trp Val Ser Ser Gln Val Pro Trp
305 310 315 320

Thr Asn Glu His Tyr Gly Val Phe Leu Ser Thr Thr Gly Asn Glu Ala
325 330 335

Ala Asn Phe Thr Ile Lys Leu Leu Glu Glu Thr Leu Gly Ser Asp Lys
340 345 350

Pro Ala Pro Met Asn Leu Val Lys Ser Glu Gly Val Lys Leu Pro Ala
355 360 365

Pro Tyr Gln Glu Arg Thr Ile Asp Leu Ser Ala Tyr Ala Gly Gln Gln
370 375 380

Val Tyr Leu Ala Phe Arg His Phe Asn Ser Thr Gly Ile Phe Arg Leu
385 390 395 400

Tyr Leu Asp Asp Val Ala Val Ser Gly Glu Gly Ser Ser Asn Asp Tyr
405 410 415

Thr Tyr Thr Val Tyr Arg Asp Asn Val Val Ile Ala Gln Asn Leu Ala
420 425 430

Ala Thr Thr Phe Asn Gln Glu Asn Val Ala Pro Gly Gln Tyr Asn Tyr
435 440 445

Cys Val Glu Val Lys Tyr Thr Ala Gly Val Ser Pro Lys Val Cys Lys
450 455 460

Asp Val Thr Val Glu Gly Ser Asn Glu Phe Ala His Val Gln Asn Leu
465 470 475 480

Thr Gly Ser Ala Val Gly Gln Lys Val Thr Leu Lys Trp Asp Ala Pro
485 490 495

Asn Gly Thr Pro Asn Pro Asn Pro Gly Thr Thr Thr Leu Ser Glu Ser
500 505 510

Phe Glu Asn Gly Ile Pro Ala Ser Trp Lys Thr Ile Asp Ala Asp Gly
515 520 525

Asp Gly Asn Asn Trp Thr Thr Thr Pro Pro Pro Gly Gly Thr Ser Phe
530 535 540

Ala Gly His Asn Ser Ala Ile Cys Ala Ser Ser Ala Ser Tyr Ile Asn
545 550 555 560

Phe Glu Gly Pro Gln Asn Pro Asp Asn Tyr Leu Val Thr Pro Glu Leu
565 570 575

Ser Leu Pro Asn Gly Gly Thr Leu Thr Phe Trp Val Cys Ala Gln Asp
580 585 590

Ala Asn Tyr Ala Ser Glu His Tyr Ala Val Tyr Ala Ser Ser Thr Gly
595 600 605

Asn Asp Ala Ser Asn Phe Ala Asn Ala Leu Leu Glu Glu Val Leu Thr
610 615 620

Ala Lys Thr Val Val Thr Ala Pro Glu Ala Ile Arg Gly Thr Arg Val
625 630 635 640

Gln Gly Thr Trp Tyr Gln Lys Thr Val Gln Leu Pro Ala Gly Thr Lys
645 650 655

Tyr Val Ala Phe Arg His Phe Gly Cys Thr Asp Phe Phe Trp Ile Asn
660 665 670

Leu Asp Asp Val Glu Ile Lys Ala Asn Gly Lys Arg Ala Asp Phe Thr
675 680 685

Glu Thr Phe Glu Ser Ser Thr His Gly Glu Ala Pro Ala Glu Trp Thr
690 695 700

Thr Ile Asp Ala Asp Gly Asp Gly Gln Gly Trp Leu Cys Leu Ser Ser
705 710 715 720

Gly Gln Leu Asp Trp Leu Thr Ala His Gly Gly Thr Asn Val Val Ala
725 730 735

Ser Phe Ser Trp Asn Gly Met Ala Leu Asn Pro Asp Asn Tyr Leu Ile
740 745 750

Ser Lys Asp Val Thr Gly Ala Thr Lys Val Lys Tyr Tyr Tyr Ala Val
755 760 765

Asn Asp Gly Phe Pro Gly Asp His Tyr Ala Val Met Ile Ser Lys Thr
770 775 780

Gly Thr Asn Ala Gly Asp Phe Thr Val Val Phe Glu Glu Thr Pro Asn
785 790 795 800

Gly Ile Asn Lys Gly Gly Ala Arg Phe Gly Leu Ser Thr Glu Ala Asp
805 810 815

Gly Ala Lys Pro Gln Ser Val Trp Ile Glu Arg Thr Val Asp Leu Pro
820 825 830

Ala Gly Thr Lys Tyr Val Ala Phe Arg His Tyr Asn Cys Ser Asp Leu
835 840 845

Asn Tyr Ile Leu Leu Asp Asp Ile Gln Phe Thr Met Gly Gly Ser Pro
850 855 860

Thr Pro Thr Asp Tyr Thr Tyr Thr Val Tyr Arg Asp Gly Thr Lys Ile
865 870 875 880

Lys Glu Gly Leu Thr Glu Thr Thr Phe Glu Glu Asp Gly Val Ala Thr
885 890 895

Gly Asn His Glu Tyr Cys Val Glu Val Lys Tyr Thr Ala Gly Val Ser
900 905 910

Pro Lys Glu Cys Val Asn Val Thr Val Asp Pro Val Gln Phe Asn Pro
915 920 925

Val Gln Asn Leu Thr Gly Ser Ala Val Gly Gln Lys Val Thr Leu Lys
930 935 940

Trp Asp Ala Pro Asn Gly Thr Pro Asn Pro Asn Pro Asn Pro Asn Pro
945 950 955 960

Gly Thr Thr Thr Leu Ser Glu Ser Phe Glu Asn Gly Ile Pro Ala Ser
965 970 975

Trp Lys Thr Ile Asp Ala Asp Gly Asp Gly Asn Asn Trp Thr Thr Thr
980 985 990

Pro Pro Pro Gly Gly Thr Ser Phe Ala Gly His Asn Ser Ala Ile Cys
995 1000 1005

Ala Ser Ser Ala Ser Tyr Ile Asn Phe Glu Gly Pro Gln Asn Pro Asp
1010 1015 1020

Asn Tyr Leu Val Thr Pro Glu Leu Ser Leu Pro Asn Gly Gly Thr Leu
1025 1030 1035 1040

Thr Phe Trp Val Cys Ala Gln Asp Ala Asn Tyr Ala Ser Glu His Tyr
1045 1050 1055

Ala Val Tyr Ala Ser Ser Thr Gly Asn Asp Ala Ser Asn Phe Ala Asn
1060 1065 1070

Ala Leu Leu Glu Glu Val Leu Thr Ala Lys Thr Val Val Thr Ala Pro
1075 1080 1085

Glu Ala Ile Arg Gly Thr Arg Val Gln Gly Thr Trp Tyr Gln Lys Thr
1090 1095 1100

Val Gln Leu Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg His Phe Gly
1105 1110 1115 1120

Cys Thr Asp Phe Phe Trp Ile Asn Leu Asp Asp Val Glu Ile Lys Ala
1125 1130 1135

Asn Gly Lys Arg Ala Asp Phe Thr Glu Thr Phe Glu Ser Ser Thr His
1140 1145 1150

Gly Glu Ala Pro Ala Glu Trp Thr Thr Ile Asp Ala Asp Gly Asp Gly
1155 1160 1165

Gln Gly Trp Leu Cys Leu Ser Ser Gly Gln Leu Gly Trp Leu Thr Ala
1170 1175 1180

His Gly Gly Thr Asn Val Val Ala Ser Phe Ser Trp Asn Gly Met Ala
1185 1190 1195 1200

Leu Asn Pro Asp Asn Tyr Leu Ile Ser Lys Asp Val Thr Gly Ala Thr
1205 1210 1215

Lys Val Lys Tyr Tyr Tyr Ala Val Asn Asp Gly Phe Pro Gly Asp His
1220 1225 1230

Tyr Ala Val Met Ile Ser Lys Thr Gly Thr Asn Ala Gly Asp Phe Thr
1235 1240 1245

Val Val Phe Glu Glu Thr Pro Asn Gly Ile Asn Lys Gly Gly Ala Arg
1250 1255 1260

Phe Gly Leu Ser Thr Glu Ala Asp Gly Ala Lys Pro Gln Ser Val Trp
1265 1270 1275 1280

Ile Glu Arg Thr Val Asp Leu Pro Ala Gly Thr Lys Tyr Val Ala Phe
1285 1290 1295

Arg His Tyr Asn Cys Ser Asp Leu Asn Tyr Ile Leu Leu Asp Asp Ile
1300 1305 1310

Gln Phe Thr Met Gly Gly Ser Pro Thr Pro Thr Asp Tyr Thr Tyr Thr
1315 1320 1325

Val Tyr Arg Asp Gly Thr Lys Ile Lys Glu Gly Leu Thr Glu Thr Thr
1330 1335 1340

Phe Glu Glu Asp Gly Val Ala Thr Gly Asn His Glu Tyr Cys Val Glu
1345 1350 1355 1360

Val Lys Tyr Thr Ala Gly Val Ser Pro Lys Glu Cys Val Asn Val Thr
1365 1370 1375

Val Asp Pro Val Gln Phe Asn Pro Val Gln Asn Leu Thr Gly Ser Ala
1380 1385 1390

Val Gly Gln Lys Val Thr Leu Lys Trp Asp Ala Pro Asn Gly Thr Pro
1395 1400 1405

Asn Pro Asn Pro Asn Pro Asn Pro Gly Thr Thr Thr Leu Ser Glu Ser
1410 1415 1420

Phe Glu Asn Gly Ile Pro Ala Ser Trp Lys Thr Ile Asp Ala Asp Gly
1425 1430 1435 1440

Asp Gly Asn Asn Trp Thr Thr Thr Pro Pro Pro Gly Gly Thr Ser Phe
1445 1450 1455

Ala Gly His Asn Ser Ala Ile Cys Ala Ser Ser Ala Ser Tyr Ile Asn
1460 1465 1470

Phe Glu Gly Pro Gln Asn Pro Asp Asn Tyr Leu Val Thr Pro Glu Leu
1475 1480 1485

Ser Leu Pro Asn Gly Gly Thr Leu Thr Phe Trp Val Cys Ala Gln Asn
1490 1495 1500

Ala Asn Tyr Ala Ser Glu His Tyr Ala Val Tyr Ala Ser Ser Thr Gly
1505 1510 1515 1520

Asn Asp Ala Ser Asn Phe Ala Asn Ala Leu Leu Glu Glu Val Leu Thr
1525 1530 1535

Ala Lys Thr Val Val Thr Ala Pro Glu Ala Ile Arg Gly Thr Arg Val
1540 1545 1550

Gln Gly Thr Trp Tyr Gln Lys Thr Val Gln Leu Pro Ala Gly Thr Lys
1555 1560 1565

Tyr Val Ala Phe Arg His Phe Gly Cys Thr Asp Phe Phe Trp Ile Asn
1570 1575 1580

Leu Asp Asp Val Glu Ile Lys Ala Asn Gly Lys Arg Ala Asp Phe Thr
1585 1590 1595 1600

Glu Thr Phe Glu Ser Ser Thr His Gly Glu Ala Pro Ala Glu Trp Thr
1605 1610 1615

Thr Ile Asp Ala Asp Gly Asp Gly Gln Gly Trp Leu Cys Leu Ser Ser
1620 1625 1630

Gly Gln Leu Gly Trp Leu Thr Ala His Gly Gly Thr Asn Val Val Ala
1635 1640 1645

Ser Phe Ser Trp Asn Gly Met Ala Leu Asn Pro Asp Asn Tyr Leu Ile
1650 1655 1660

Ser Lys Asp Val Thr Gly Ala Thr Lys Val Lys Tyr Tyr Tyr Ala Val
1665 1670 1675 1680

Asn Asp Gly Phe Pro Gly Asp His Tyr Ala Val Met Ile Ser Lys Thr
1685 1690 1695

Gly Thr Asn Ala Gly Asp Phe Thr Val Val Phe Glu Glu Thr Pro Asn
1700 1705 1710

Gly Ile Asn Lys Gly Gly Ala Arg Phe Gly Leu Ser Thr Glu Ala Asp
1715 1720 1725

Gly Ala Lys Pro Gln Ser Val Trp Ile Glu Arg Thr Val Asp Leu Pro
1730 1735 1740

Ala Gly Thr Lys Tyr Val Ala Phe Arg His Tyr Asn Cys Ser Asp Leu
1745 1750 1755 1760

Asn Tyr Ile Leu Leu Asp Asp Ile Gln Phe Thr Met Gly Gly Ser Pro
1765 1770 1775

Thr Pro Thr Asp Tyr Thr Tyr Thr Val Tyr Arg Asp Gly Thr Lys Ile
1780 1785 1790

Lys Glu Gly Leu Thr Glu Thr Thr Phe Glu Glu Asp Gly Val Ala Thr
1795 1800 1805

Gly Asn His Glu Tyr Cys Val Glu Val Lys Tyr Thr Ala Gly Val Ser
1810 1815 1820

Pro Lys Glu Cys Val Asn Val Thr Val Asp Pro Val Gln Phe Asn Pro
1825 1830 1835 1840

Val Gln Asn Leu Thr Gly Ser Ala Val Gly Gln Lys Val Thr Leu Lys
1845 1850 1855

Trp Asp Ala Pro Asn Gly Thr Pro Asn Pro Asn Pro Asn Pro Asn Pro
1860 1865 1870

Gly Thr Thr Thr Leu Ser Glu Ser Phe Glu Asn Gly Ile Pro Ala Ser
1875 1880 1885

Trp Lys Thr Ile Asp Ala Asp Gly Asp Gly Asn Asn Trp Thr Thr Thr
1890 1895 1900

Pro Pro Pro Gly Gly Thr Ser Phe Ala Gly His Asn Ser Ala Ile Cys
1905 1910 1915 1920

Val Ser Ser Ala Ser Tyr Ile Asn Phe Glu Gly Pro Gln Asn Pro Asp
1925 1930 1935

Asn Tyr Leu Val Thr Pro Glu Leu Ser Leu Pro Gly Gly Gly Thr Leu
1940 1945 1950

Thr Phe Trp Val Cys Ala Gln Asp Ala Asn Tyr Ala Ser Glu His Tyr
1955 1960 1965

Ala Val Tyr Ala Ser Ser Thr Gly Asn Asp Ala Ser Asn Phe Ala Asn
1970 1975 1980

Ala Leu Leu Glu Glu Val Leu Thr Ala Lys Thr Val Val Thr Ala Pro
1985 1990 1995 2000

Glu Ala Ile Arg Gly Thr Arg Val Gln Gly Thr Trp Tyr Gln Lys Thr
2005 2010 2015

Val Gln Leu Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg His Phe Gly
2020 2025 2030

Cys Thr Asp Phe Phe Trp Ile Asn Leu Asp Glu Val Glu Ile Lys Ala
2035 2040 2045

Asn Gly Lys Arg Ala Asp Phe Thr Glu Thr Phe Glu Ser Ser Thr His
2050 2055 2060

Gly Glu Ala Pro Ala Glu Trp Thr Thr Ile Asp Ala Asp Gly Asp Gly
2065 2070 2075 2080

Gln Gly Trp Leu Cys Leu Ser Ser Gly Gln Leu Asp Trp Leu Thr Ala
2085 2090 2095

His Gly Gly Thr Asn Val Val Ala Ser Phe Ser Trp Asn Gly Met Ala
2100 2105 2110

Leu Asn Pro Asp Asn Tyr Leu Ile Ser Lys Asp Val Thr Gly Ala Thr
2115 2120 2125

Lys Val Lys Tyr Tyr Tyr Ala Val Asn Asp Gly Phe Pro Gly Asp His
2130 2135 2140

Tyr Ala Val Met Ile Ser Lys Thr Gly Thr Asn Ala Gly Asp Phe Thr
2145 2150 2155 2160

Val Val Phe Glu Glu Thr Pro Asn Gly Ile Asn Lys Gly Gly Ala Arg
2165 2170 2175

Phe Gly Leu Ser Thr Glu Ala Asp Gly Ala Lys Pro Gln Ser Val Trp
2180 2185 2190

Ile Glu Arg Thr Val Asp Leu Pro Ala Gly Thr Lys Tyr Val Ala Phe
2195 2200 2205

Arg His Tyr Asn Cys Ser Asp Leu Asn Tyr Ile Leu Leu Asp Asp Ile
2210 2215 2220

Gln Phe Thr Met Gly Gly Ser Pro Thr Pro Thr Asp Tyr Thr Tyr Thr
2225 2230 2235 2240

Val Tyr Arg Asp Gly Thr Lys Ile Lys Glu Gly Leu Thr Glu Thr Thr
2245 2250 2255

Phe Glu Glu Asp Gly Val Ala Thr Gly Asn His Glu Tyr Cys Val Glu
2260 2265 2270

Val Lys Tyr Thr Ala Gly Val Ser Pro Lys Val Cys Val Asn Val Thr
2275 2280 2285

Ile Asn Pro Thr Gln Phe Asn Pro Val Gln Asn Leu Thr Ala Glu Gln
2290 2295 2300

Ala Pro Asn Ser Met Asp Ala Ile Leu Lys Trp Asn Ala Pro Ala Ser
2305 2310 2315 2320

Lys Arg Ala Glu Val Leu Asn Glu Asp Phe Glu Asn Gly Ile Pro Ser
2325 2330 2335

Ser Trp Lys Thr Ile Asp Ala Asp Gly Asp Gly Asn Asn Trp Thr Thr
2340 2345 2350

Thr Pro Pro Pro Gly Gly Ser Ser Phe Ala Gly His Asn Ser Ala Ile
2355 2360 2365

Cys Val Ser Ser Ala Ser Tyr Ile Asn Phe Glu Gly Pro Gln Asn Pro
2370 2375 2380

Asp Asn Tyr Leu Val Thr Pro Glu Leu Ser Leu Pro Gly Gly Gly Thr
2385 2390 2395 2400

Leu Thr Phe Trp Val Cys Ala Gln Asp Ala Asn Tyr Ala Ser Glu His
2405 2410 2415

Tyr Ala Val Tyr Ala Ser Ser Thr Gly Asn Asp Ala Ser Asn Phe Ala
2420 2425 2430

Asn Ala Leu Leu Glu Glu Val Leu Thr Ala Lys Thr Val Val Thr Ala
2435 2440 2445

Pro Glu Ala Ile Arg Gly Thr Arg Val Gln Gly Thr Trp Tyr Gln Lys
2450 2455 2460

Thr Val Gln Leu Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg His Phe
2465 2470 2475 2480

Gly Cys Thr Asp Phe Phe Trp Ile Asn Leu Asp Asp Val Val Ile Thr
2485 2490 2495

Ser Gly Asn Ala Pro Ser Tyr Thr Tyr Thr Ile Tyr Arg Asn Asn Thr
2500 2505 2510

Gln Ile Ala Ser Gly Val Thr Glu Thr Thr Tyr Arg Asp Pro Asp Leu
2515 2520 2525

Ala Thr Gly Phe Tyr Thr Tyr Gly Val Lys Val Val Tyr Pro Asn Gly
2530 2535 2540

Glu Ser Ala Ile Glu Thr Ala Thr Leu Asn Ile Thr Ser Leu Ala Asp
2545 2550 2555 2560

Val Thr Ala Gln Lys Pro Tyr Thr Leu Thr Val Val Gly Lys Thr Ile
2565 2570 2575

Thr Val Thr Cys Gln Gly Glu Ala Met Ile Tyr Asp Met Asn Gly Arg
2580 2585 2590

Arg Leu Ala Ala Gly Arg Asn Thr Val Val Tyr Thr Ala Gln Gly Gly
2595 2600 2605

His Tyr Ala Val Met Val Val Val Asp Gly Lys Ser Tyr Val Glu Lys
2610 2615 2620 

Leu Ala Val Lys 
2625 



(2) INFORMATION FOR SEQ ID NO: 15: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1350 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1350 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

CCG AAT CCG AAT CCC GGA ACA ACA ACA CTT TCC GAA TCA TTC GAA AAT 48
Pro Asn Pro Asn Pro Gly Thr Thr Thr Leu Ser Glu Ser Phe Glu Asn
2630 2635 2640

GGT ATT CCT GCC TCA TGG AAG ACG ATC GAT GCA GAC GGT GAC GGC AAC 96
Gly Ile Pro Ala Ser Trp Lys Thr Ile Asp Ala Asp Gly Asp Gly Asn
2645 2650 2655 2660

AAT TGG ACG ACG ACC CCT CCT CCC GGA GGC ACC TCT TTT GCA GGT CAC 144
Asn Trp Thr Thr Thr Pro Pro Pro Gly Gly Thr Ser Phe Ala Gly His
2665 2670 2675

AAC AGT GCA ATC TGT GCC TCT TCG GCT TCT TAT ATC AAC TTT GAA GGT 192
Asn Ser Ala Ile Cys Ala Ser Ser Ala Ser Tyr Ile Asn Phe Glu Gly
2680 2685 2690

CCT CAG AAC CCT GAT AAC TAT CTG GTT ACA CCG GAG CTA TCT CTT CCT 240
Pro Gln Asn Pro Asp Asn Tyr Leu Val Thr Pro Glu Leu Ser Leu Pro
2695 2700 2705

AAC GGA GGA ACG CTT ACT TTC TGG GTA TGT GCA CAA GAT GCC AAT TAT 288
Asn Gly Gly Thr Leu Thr Phe Trp Val Cys Ala Gln Asp Ala Asn Tyr
2710 2715 2720

GCA TCA GAG CAC TAT GCC GTG TAC GCA TCT TCT ACG GGT AAC GAC GCT 336
Ala Ser Glu His Tyr Ala Val Tyr Ala Ser Ser Thr Gly Asn Asp Ala
2725 2730 2735 2740

TCC AAC TTC GCC AAC GCT TTG TTG GAA GAA GTG CTG ACG GCC AAG ACA 384
Ser Asn Phe Ala Asn Ala Leu Leu Glu Glu Val Leu Thr Ala Lys Thr
2745 2750 2755

GTT GTT ACG GCA CCT GAA GCC ATT CGT GGC ACT CGT GTT CAG GGC ACC 432
Val Val Thr Ala Pro Glu Ala Ile Arg Gly Thr Arg Val Gln Gly Thr
2760 2765 2770

TGG TAT CAA AAG ACG GTA CAG TTG CCT GCG GGT ACT AAG TAT GTT GCT 480
Trp Tyr Gln Lys Thr Val Gln Leu Pro Ala Gly Thr Lys Tyr Val Ala
2775 2780 2785

TTC CGT CAC TTC GGC TGT ACG GAC TTC TTC TGG ATT AAC CTT GAT GAT 528
Phe Arg His Phe Gly Cys Thr Asp Phe Phe Trp Ile Asn Leu Asp Asp
2790 2795 2800

GTT GAG ATC AAG GCC AAC GGC AAG CGC GCA GAC TTC ACG GAA ACG TTC 576 
Val Glu Ile Lys Ala Asn Gly Lys Arg Ala Asp Phe Thr Glu Thr Phe
2805 2810 2815 2820

GAG TCT TCT ACT CAT GGA GAG GCA CCG GCG GAA TGG ACT ACT ATC GAT 624
Glu Ser Ser Thr His Gly Glu Ala Pro Ala Glu Trp Thr Thr Ile Asp
2825 2830 2835

GCC GAT GGC GAT GGT CAG GGT TGG CTC TGT CTG TCT TCC GGA CAA TTG 672
Ala Asp Gly Asp Gly Gln Gly Trp Leu Cys Leu Ser Ser Gly Gln Leu
2840 2845 2850

GAC TGG CTG ACA GCT CAT GGC GGC ACC AAC GTA GTA GCC TCT TTC TCA 720
Asp Trp Leu Thr Ala His Gly Gly Thr Asn Val Val Ala Ser Phe Ser
2855 2860 2865

TGG AAT GGA ATG GCT TTG AAT CCT GAT AAC TAT CTC ATC TCA AAG GAT 768
Trp Asn Gly Met Ala Leu Asn Pro Asp Asn Tyr Leu Ile Ser Lys Asp
2870 2875 2880

GTT ACA GGC GCA ACT AAG GTA AAG TAC TAC TAT GCA GTC AAC GAC GGT 816
Val Thr Gly Ala Thr Lys Val Lys Tyr Tyr Tyr Ala Val Asn Asp Gly
2885 2890 2895 2900

TTT CCC GGG GAT CAC TAT GCG GTG ATG ATC TCC AAG ACG GGC ACG AAC 864
Phe Pro Gly Asp His Tyr Ala Val Met Ile Ser Lys Thr Gly Thr Asn
2905 2910 2915

GCC GGA GAC TTC ACG GTT GTT TTC GAA GAA ACG CCT AAC GGA ATA AAT 912
Ala Gly Asp Phe Thr Val Val Phe Glu Glu Thr Pro Asn Gly Ile Asn
2920 2925 2930

AAG GGC GGA GCA AGA TTC GGT CTT TCC ACG GAA GCC GAT GGC GCC AAA 960
Lys Gly Gly Ala Arg Phe Gly Leu Ser Thr Glu Ala Asp Gly Ala Lys
2935 2940 2945

CCT CAA AGT GTA TGG ATC GAG CGT ACG GTA GAT TTG CCT GCG GGT ACT 1008
Pro Gln Ser Val Trp Ile Glu Arg Thr Val Asp Leu Pro Ala Gly Thr
2950 2955 2960

AAG TAT GTT GCT TTC CGT CAC TAC AAT TGC TCG GAT TTG AAC TAC ATT 1056
Lys Tyr Val Ala Phe Arg His Tyr Asn Cys Ser Asp Leu Asn Tyr Ile
2965 2970 2975 2980

CTT TTG GAT GAT ATT CAG TTC ACC ATG GGT GGC AGC CCC ACC CCG ACC 1104
Leu Leu Asp Asp Ile Gln Phe Thr Met Gly Gly Ser Pro Thr Pro Thr
2985 2990 2995

GAT TAT ACC TAC ACG GTG TAT CGT GAC GGT ACG AAG ATC AAG GAA GGT 1152
Asp Tyr Thr Tyr Thr Val Tyr Arg Asp Gly Thr Lys Ile Lys Glu Gly
3000 3005 3010

CTG ACC GAA ACG ACC TTC GAA GAA GAC GGT GTA GCT ACG GGC AAC CAT 1200
Leu Thr Glu Thr Thr Phe Glu Glu Asp Gly Val Ala Thr Gly Asn His
3015 3020 3025

GAG TAT TGC GTG GAA GTG AAG TAC ACA GCC GGC GTA TCT CCG AAA GAG 1248
Glu Tyr Cys Val Glu Val Lys Tyr Thr Ala Gly Val Ser Pro Lys Glu
3030 3035 3040

TGT GTA AAC GTA ACT GTT GAT CCT GTG CAG TTC AAT CCT GTA CAG AAC 1296
Cys Val Asn Val Thr Val Asp Pro Val Gln Phe Asn Pro Val Gln Asn
3045 3050 3055 3060

CTG ACC GGT AGT GCA GTC GGC CAG AAA GTA ACG CTT AAG TGG GAT GCA 1344
Leu Thr Gly Ser Ala Val Gly Gln Lys Val Thr Leu Lys Trp Asp Ala
3065 3070 3075 

CCT AAT 1350 
Pro Asn 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 450 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16: 

Pro Asn Pro Asn Pro Gly Thr Thr Thr Leu Ser Glu Ser Phe Glu Asn 
1 5 10 15

Gly Ile Pro Ala Ser Trp Lys Thr Ile Asp Ala Asp Gly Asp Gly Asn
20 25 30

Asn Trp Thr Thr Thr Pro Pro Pro Gly Gly Thr Ser Phe Ala Gly His
35 40 45

Asn Ser Ala Ile Cys Ala Ser Ser Ala Ser Tyr Ile Asn Phe Glu Gly
50 55 60

Pro Gln Asn Pro Asp Asn Tyr Leu Val Thr Pro Glu Leu Ser Leu Pro
65 70 75 80

Asn Gly Gly Thr Leu Thr Phe Trp Val Cys Ala Gln Asp Ala Asn Tyr
85 90 95

Ala Ser Glu His Tyr Ala Val Tyr Ala Ser Ser Thr Gly Asn Asp Ala
100 105 110

Ser Asn Phe Ala Asn Ala Leu Leu Glu Glu Val Leu Thr Ala Lys Thr
115 120 125

Val Val Thr Ala Pro Glu Ala Ile Arg Gly Thr Arg Val Gln Gly Thr
130 135 140

Trp Tyr Gln Lys Thr Val Gln Leu Pro Ala Gly Thr Lys Tyr Val Ala
145 150 155 160

Phe Arg His Phe Gly Cys Thr Asp Phe Phe Trp Ile Asn Leu Asp Asp
165 170 175

Val Glu Ile Lys Ala Asn Gly Lys Arg Ala Asp Phe Thr Glu Thr Phe
180 185 190

Glu Ser Ser Thr His Gly Glu Ala Pro Ala Glu Trp Thr Thr Ile Asp
195 200 205

Ala Asp Gly Asp Gly Gln Gly Trp Leu Cys Leu Ser Ser Gly Gln Leu
210 215 220

Asp Trp Leu Thr Ala His Gly Gly Thr Asn Val Val Ala Ser Phe Ser
225 230 235 240

Trp Asn Gly Met Ala Leu Asn Pro Asp Asn Tyr Leu Ile Ser Lys Asp
245 250 255

Val Thr Gly Ala Thr Lys Val Lys Tyr Tyr Tyr Ala Val Asn Asp Gly
260 265 270

Phe Pro Gly Asp His Tyr Ala Val Met Ile Ser Lys Thr Gly Thr Asn
275 280 285

Ala Gly Asp Phe Thr Val Val Phe Glu Glu Thr Pro Asn Gly Ile Asn
290 295 300

Lys Gly Gly Ala Arg Phe Gly Leu Ser Thr Glu Ala Asp Gly Ala Lys
305 310 315 320

Pro Gln Ser Val Trp Ile Glu Arg Thr Val Asp Leu Pro Ala Gly Thr
325 330 335

Lys Tyr Val Ala Phe Arg His Tyr Asn Cys Ser Asp Leu Asn Tyr Ile
340 345 350

Leu Leu Asp Asp Ile Gln Phe Thr Met Gly Gly Ser Pro Thr Pro Thr
355 360 365

Asp Tyr Thr Tyr Thr Val Tyr Arg Asp Gly Thr Lys Ile Lys Glu Gly
370 375 380

Leu Thr Glu Thr Thr Phe Glu Glu Asp Gly Val Ala Thr Gly Asn His
385 390 395 400

Glu Tyr Cys Val Glu Val Lys Tyr Thr Ala Gly Val Ser Pro Lys Glu
405 410 415

Cys Val Asn Val Thr Val Asp Pro Val Gln Phe Asn Pro Val Gln Asn
420 425 430

Leu Thr Gly Ser Ala Val Gly Gln Lys Val Thr Leu Lys Trp Asp Ala
435 440 445 

Pro Asn 
450 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1368 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1368 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

GGT ACC CCG AAT CCA AAT CCA AAT CCG AAT CCG GGA ACA ACA ACA CTT 48 
Gly Thr Pro Asn Pro Asn Pro Asn Pro Asn Pro Gly Thr Thr Thr Leu 
455 460 465 

TCC GAA TCA TTC GAA AAT GGT ATT CCT GCC TCA TGG AAG ACG ATC GAT 96 
Ser Glu Ser Phe Glu Asn Gly Ile Pro Ala Ser Trp Lys Thr Ile Asp
470 475 480

GCA GAC GGT GAC GGC AAC AAT TGG ACG ACG ACC CCT CCT CCC GGA GGC 144
Ala Asp Gly Asp Gly Asn Asn Trp Thr Thr Thr Pro Pro Pro Gly Gly
485 490 495

ACC TCT TTT GCA GGT CAC AAC AGT GCG ATC TGT GCC TCT TCG GCT TCT 192
Thr Ser Phe Ala Gly His Asn Ser Ala Ile Cys Ala Ser Ser Ala Ser
500 505 510

TAT ATC AAC TTT GAA GGC CCT CAG AAC CCT GAT AAC TAT CTG GTT ACA 240
Tyr Ile Asn Phe Glu Gly Pro Gln Asn Pro Asp Asn Tyr Leu Val Thr
515 520 525 530

CCG GAG CTA TCT CTT CCT AAC GGA GGA ACG CTT ACT TTC TGG GTA TGT 288
Pro Glu Leu Ser Leu Pro Asn Gly Gly Thr Leu Thr Phe Trp Val Cys
535 540 545

GCA CAA GAT GCC AAT TAT GCA TCA GAG CAC TAT GCC GTG TAT GCA TCT 336
Ala Gln Asp Ala Asn Tyr Ala Ser Glu His Tyr Ala Val Tyr Ala Ser
550 555 560

TCT ACG GGT AAC GAC GCT TCC AAC TTC GCC AAC GCT TTG TTG GAA GAA 384
Ser Thr Gly Asn Asp Ala Ser Asn Phe Ala Asn Ala Leu Leu Glu Glu
565 570 575

GTG CTG ACG GCC AAG ACA GTT GTT ACG GCA CCT GAA GCC ATT CGT GGC 432
Val Leu Thr Ala Lys Thr Val Val Thr Ala Pro Glu Ala Ile Arg Gly
580 585 590 



ACT CGT GTT CAG GGC ACC TGG TAT CAA AAG ACG GTA CAG TTG CCT GCG 
Thr Arg Val Gin Gly Thr Trp Tyr Gin Lys Thr Val Gin Leu Pro Ala 
595 600 605 610 



480 



GGT ACT AAG TAT GTT GCT TTC CGT CAC TTC GGC TGT ACG GAC TTC TTC 528 
Gly Thr Lys Tyr Val Ala Phe Arg His Phe Gly Cys Thr Asp Phe Phe 
€15 620 625 

TGG ATC AAC CTT GAT GAT GTT GAG ATC AAG GCC AAC GGC AAG CGC GCA 57 6 

Trp He Asn Leu Asp Asp Val Glu He Lys Ala Asn Gly Lys Arg Ala 
630 635 640 

GAC TTC ACG GAA ACG TTC GAG TCT TCT ACT CAT GGA GAG GCA CCG GCG 624 
Asp Phe Thr Glu Thr Phe Glu Ser Ser Thr His Gly Glu Ala Pro Ala 
"5 650 655 
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GAA TGG ACT ACT ATC GAT GCC GAT GGC GAT GGT CAG GGT TGG CTC TGT 672 
Glu Trp Thr Thr lie Asp Ala Asp Gly Asp Gly Gin Gly Trp Leu Cys 
660 665 670 

CTG TCT TCC GGA CAA TTG GGC TGG CTG ACA GCT CAT GGC GGC ACC AAC 720 
Leu Ser Ser Gly Gln Leu Gly Trp Leu Thr Ala His Gly Gly Thr Asn
675 680 685 690

GTA GTA GCC TCT TTC TCA TGG AAT GGA ATG GCT TTG AAT CCT GAT AAC 768 
Val Val Ala Ser Phe Ser Trp Asn Gly Met Ala Leu Asn Pro Asp Asn 
695 700 705 

TAT CTC ATC TCA AAG GAT GTT ACA GGC GCA ACT AAG GTA AAG TAC TAC 816
Tyr Leu Ile Ser Lys Asp Val Thr Gly Ala Thr Lys Val Lys Tyr Tyr
710 715 720 

TAT GCA GTC AAC GAC GGT TTT CCC GGG GAT CAC TAT GCG GTG ATG ATC 864
Tyr Ala Val Asn Asp Gly Phe Pro Gly Asp His Tyr Ala Val Met Ile
725 730 735 

TCC AAG ACG GGC ACG AAC GCC GGA GAC TTC ACG GTT GTT TTC GAA GAA 912 
Ser Lys Thr Gly Thr Asn Ala Gly Asp Phe Thr Val Val Phe Glu Glu 
740 745 750 

ACG CCT AAC GGA ATA AAT AAG GGC GGA GCA AGA TTC GGT CTT TCC ACG 960
Thr Pro Asn Gly Ile Asn Lys Gly Gly Ala Arg Phe Gly Leu Ser Thr
755 760 765 770 

GAA GCC GAT GGC GCC AAA CCT CAA AGT GTA TGG ATC GAG CGT ACG GTA 1008 
Glu Ala Asp Gly Ala Lys Pro Gln Ser Val Trp Ile Glu Arg Thr Val
775 780 785 

GAT TTG CCT GCG GGT ACT AAG TAT GTT GCT TTC CGT CAC TAC AAT TGC 1056 
Asp Leu Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg His Tyr Asn Cys 
790 795 800 

TCG GAT TTG AAC TAC ATT CTT TTG GAT GAT ATT CAG TTC ACC ATG GGT 1104 
Ser Asp Leu Asn Tyr Ile Leu Leu Asp Asp Ile Gln Phe Thr Met Gly
805 810 815 

GGC AGC CCC ACC CCG ACC GAT TAT ACC TAC ACG GTG TAT CGT GAC GGT 1152 
Gly Ser Pro Thr Pro Thr Asp Tyr Thr Tyr Thr Val Tyr Arg Asp Gly 
820 825 830 

ACG AAG ATC AAG GAA GGT CTG ACC GAA ACG ACC TTC GAA GAA GAC GGT 1200 
Thr Lys Ile Lys Glu Gly Leu Thr Glu Thr Thr Phe Glu Glu Asp Gly
835 840 845 850 

GTA GCT ACG GGC AAC CAT GAG TAT TGC GTG GAA GTG AAG TAC ACA GCC 1248 
Val Ala Thr Gly Asn His Glu Tyr Cys Val Glu Val Lys Tyr Thr Ala 
855 860 865 

GGC GTA TCT CCG AAA GAG TGT GTA AAC GTA ACT GTT GAT CCT GTG CAG 1296 
Gly Val Ser Pro Lys Glu Cys Val Asn Val Thr Val Asp Pro Val Gln
870 875 880 

TTC AAT CCT GTA CAG AAC CTG ACC GGT AGT GCA GTC GGC CAG AAA GTA 1344
Phe Asn Pro Val Gln Asn Leu Thr Gly Ser Ala Val Gly Gln Lys Val
885 890 895 

ACG CTT AAG TGG GAT GCA CCT AAT 1368
Thr Leu Lys Trp Asp Ala Pro Asn 
900 905

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 456 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Gly Thr Pro Asn Pro Asn Pro Asn Pro Asn Pro Gly Thr Thr Thr Leu
1 5 10 15

Ser Glu Ser Phe Glu Asn Gly Ile Pro Ala Ser Trp Lys Thr Ile Asp
20 25 30

Ala Asp Gly Asp Gly Asn Asn Trp Thr Thr Thr Pro Pro Pro Gly Gly
35 40 45

Thr Ser Phe Ala Gly His Asn Ser Ala Ile Cys Ala Ser Ser Ala Ser
50 55 60

Tyr Ile Asn Phe Glu Gly Pro Gln Asn Pro Asp Asn Tyr Leu Val Thr
65 70 75 80

Pro Glu Leu Ser Leu Pro Asn Gly Gly Thr Leu Thr Phe Trp Val Cys
85 90 95

Ala Gln Asp Ala Asn Tyr Ala Ser Glu His Tyr Ala Val Tyr Ala Ser
100 105 110

Ser Thr Gly Asn Asp Ala Ser Asn Phe Ala Asn Ala Leu Leu Glu Glu
115 120 125

Val Leu Thr Ala Lys Thr Val Val Thr Ala Pro Glu Ala Ile Arg Gly
130 135 140

Thr Arg Val Gln Gly Thr Trp Tyr Gln Lys Thr Val Gln Leu Pro Ala
145 150 155 160

Gly Thr Lys Tyr Val Ala Phe Arg His Phe Gly Cys Thr Asp Phe Phe
165 170 175

Trp Ile Asn Leu Asp Asp Val Glu Ile Lys Ala Asn Gly Lys Arg Ala
180 185 190

Asp Phe Thr Glu Thr Phe Glu Ser Ser Thr His Gly Glu Ala Pro Ala
195 200 205

Glu Trp Thr Thr Ile Asp Ala Asp Gly Asp Gly Gln Gly Trp Leu Cys
210 215 220

Leu Ser Ser Gly Gln Leu Gly Trp Leu Thr Ala His Gly Gly Thr Asn
225 230 235 240

Val Val Ala Ser Phe Ser Trp Asn Gly Met Ala Leu Asn Pro Asp Asn
245 250 255

Tyr Leu Ile Ser Lys Asp Val Thr Gly Ala Thr Lys Val Lys Tyr Tyr
260 265 270

Tyr Ala Val Asn Asp Gly Phe Pro Gly Asp His Tyr Ala Val Met Ile
275 280 285

Ser Lys Thr Gly Thr Asn Ala Gly Asp Phe Thr Val Val Phe Glu Glu
290 295 300

Thr Pro Asn Gly Ile Asn Lys Gly Gly Ala Arg Phe Gly Leu Ser Thr
305 310 315 320

Glu Ala Asp Gly Ala Lys Pro Gln Ser Val Trp Ile Glu Arg Thr Val
325 330 335

Asp Leu Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg His Tyr Asn Cys
340 345 350

Ser Asp Leu Asn Tyr Ile Leu Leu Asp Asp Ile Gln Phe Thr Met Gly
355 360 365

Gly Ser Pro Thr Pro Thr Asp Tyr Thr Tyr Thr Val Tyr Arg Asp Gly
370 375 380

Thr Lys Ile Lys Glu Gly Leu Thr Glu Thr Thr Phe Glu Glu Asp Gly
385 390 395 400

Val Ala Thr Gly Asn His Glu Tyr Cys Val Glu Val Lys Tyr Thr Ala
405 410 415

Gly Val Ser Pro Lys Glu Cys Val Asn Val Thr Val Asp Pro Val Gln
420 425 430

Phe Asn Pro Val Gln Asn Leu Thr Gly Ser Ala Val Gly Gln Lys Val
435 440 445

Thr Leu Lys Trp Asp Ala Pro Asn
450 455

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1368 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1368

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19: 

GGT ACC CCG AAT CCA AAT CCA AAT CCG AAT CCG GGA ACA ACA ACA CTT 48
Gly Thr Pro Asn Pro Asn Pro Asn Pro Asn Pro Gly Thr Thr Thr Leu
460 465 470 

TCC GAA TCA TTC GAA AAT GGT ATT CCT GCC TCA TGG AAG ACG ATC GAT 96 
Ser Glu Ser Phe Glu Asn Gly Ile Pro Ala Ser Trp Lys Thr Ile Asp
475 480 485

GCA GAC GGT GAC GGC AAC AAT TGG ACG ACG ACC CCT CCT CCC GGA GGC 144
Ala Asp Gly Asp Gly Asn Asn Trp Thr Thr Thr Pro Pro Pro Gly Gly 
490 495 500 

ACC TCT TTT GCA GGT CAC AAC AGT GCG ATC TGT GCC TCT TCG GCT TCT 192 
Thr Ser Phe Ala Gly His Asn Ser Ala Ile Cys Ala Ser Ser Ala Ser
505 510 515 520 

TAT ATC AAC TTT GAA GGC CCT CAG AAC CCT GAT AAC TAT CTG GTT ACA 240 
Tyr Ile Asn Phe Glu Gly Pro Gln Asn Pro Asp Asn Tyr Leu Val Thr
525 530 535 

CCG GAG CTA TCT CTT CCT AAC GGA GGA ACG CTT ACT TTC TGG GTA TGT 288
Pro Glu Leu Ser Leu Pro Asn Gly Gly Thr Leu Thr Phe Trp Val Cys
540 545 550

GCA CAA GAT GCC AAT TAT GCA TCA GAG CAC TAT GCC GTG TAT GCA TCT 336
Ala Gln Asp Ala Asn Tyr Ala Ser Glu His Tyr Ala Val Tyr Ala Ser
555 560 565

TCT ACG GGT AAC GAC GCT TCC AAC TTC GCC AAC GCT TTG TTG GAA GAA 384
Ser Thr Gly Asn Asp Ala Ser Asn Phe Ala Asn Ala Leu Leu Glu Glu
570 575 580

GTG CTG ACG GCC AAG ACA GTT GTT ACG GCA CCT GAA GCC ATT CGT GGC 432
Val Leu Thr Ala Lys Thr Val Val Thr Ala Pro Glu Ala Ile Arg Gly
585 590 595 600

ACT CGT GTT CAG GGC ACC TGG TAT CAA AAG ACG GTA CAG TTG CCT GCG 480
Thr Arg Val Gln Gly Thr Trp Tyr Gln Lys Thr Val Gln Leu Pro Ala
605 610 615

GGT ACT AAG TAT GTT GCT TTC CGT CAC TTC GGC TGT ACG GAC TTC TTC 528
Gly Thr Lys Tyr Val Ala Phe Arg His Phe Gly Cys Thr Asp Phe Phe
620 625 630

TGG ATC AAC CTT GAT GAT GTT GAG ATC AAG GCC AAC GGC AAG CGC GCA 576
Trp Ile Asn Leu Asp Asp Val Glu Ile Lys Ala Asn Gly Lys Arg Ala
635 640 645

GAC TTC ACG GAA ACG TTC GAG TCT TCT ACT CAT GGA GAG GCA CCG GCG 624
Asp Phe Thr Glu Thr Phe Glu Ser Ser Thr His Gly Glu Ala Pro Ala
650 655 660

GAA TGG ACT ACT ATC GAT GCC GAT GGC GAT GGT CAG GGT TGG CTC TGT 672
Glu Trp Thr Thr Ile Asp Ala Asp Gly Asp Gly Gln Gly Trp Leu Cys
665 670 675 680

CTG TCT TCC GGA CAA TTG GGC TGG CTG ACA GCT CAT GGC GGC ACC AAC 720
Leu Ser Ser Gly Gln Leu Gly Trp Leu Thr Ala His Gly Gly Thr Asn
685 690 695

GTA GTA GCC TCT TTC TCA TGG AAT GGA ATG GCT TTG AAT CCT GAT AAC 768
Val Val Ala Ser Phe Ser Trp Asn Gly Met Ala Leu Asn Pro Asp Asn
700 705 710

TAT CTC ATC TCA AAG GAT GTT ACA GGC GCA ACT AAG GTA AAG TAC TAC 816
Tyr Leu Ile Ser Lys Asp Val Thr Gly Ala Thr Lys Val Lys Tyr Tyr
715 720 725

TAT GCA GTC AAC GAC GGT TTT CCC GGG GAT CAC TAT GCG GTG ATG ATC 864
Tyr Ala Val Asn Asp Gly Phe Pro Gly Asp His Tyr Ala Val Met Ile
730 735 740




TCC AAG ACG GGC ACG AAC GCC GGA GAC TTC ACG GTT GTT TTC GAA GAA 912
Ser Lys Thr Gly Thr Asn Ala Gly Asp Phe Thr Val Val Phe Glu Glu
745 750 755 760 



ACG CCT AAC GGA ATA AAT AAG GGC GGA GCA AGA TTC GGT CTT TCC ACG 960
Thr Pro Asn Gly Ile Asn Lys Gly Gly Ala Arg Phe Gly Leu Ser Thr
765 770 775

GAA GCC GAT GGC GCC AAA CCT CAA AGT GTA TGG ATC GAG CGT ACG GTA 1008 
Glu Ala Asp Gly Ala Lys Pro Gln Ser Val Trp Ile Glu Arg Thr Val
780 785 790 

GAT TTG CCT GCG GGT ACT AAG TAT GTT GCT TTC CGA CAC TAC AAT TGC 1056 
Asp Leu Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg His Tyr Asn Cys 
795 800 805 

TCG GAT TTG AAC TAC ATT CTT TTG GAT GAT ATT CAG TTC ACC ATG GGT 1104 
Ser Asp Leu Asn Tyr Ile Leu Leu Asp Asp Ile Gln Phe Thr Met Gly
810 815 820 

GGC AGC CCC ACC CCG ACC GAT TAT ACC TAC ACG GTG TAT CGT GAC GGT 1152 
Gly Ser Pro Thr Pro Thr Asp Tyr Thr Tyr Thr Val Tyr Arg Asp Gly 
825 830 835 840 



ACG AAG ATC AAG GAA GGT CTG ACC GAA ACG ACC TTC GAA GAA GAC GGT 1200 
Thr Lys Ile Lys Glu Gly Leu Thr Glu Thr Thr Phe Glu Glu Asp Gly
845 850 855

GTA GCT ACG GGC AAC CAT GAG TAT TGC GTG GAA GTG AAG TAC ACA GCC 1248 
Val Ala Thr Gly Asn His Glu Tyr Cys Val Glu Val Lys Tyr Thr Ala 
860 865 870 

GGC GTA TCT CCG AAA GAG TGT GTA AAC GTA ACT GTT GAT CCT GTG CAG 1296 
Gly Val Ser Pro Lys Glu Cys Val Asn Val Thr Val Asp Pro Val Gln
875 880 885 

TTC AAT CCT GTA CAG AAC CTG ACC GGT AGT GCA GTC GGC CAG AAA GTA 1344 
Phe Asn Pro Val Gln Asn Leu Thr Gly Ser Ala Val Gly Gln Lys Val
890 895 900 

ACG CTT AAG TGG GAT GCA CCT AAT 1368 
Thr Leu Lys Trp Asp Ala Pro Asn 
905 910 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 456 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Gly Thr Pro Asn Pro Asn Pro Asn Pro Asn Pro Gly Thr Thr Thr Leu
1 5 10 15

Ser Glu Ser Phe Glu Asn Gly Ile Pro Ala Ser Trp Lys Thr Ile Asp
20 25 30

Ala Asp Gly Asp Gly Asn Asn Trp Thr Thr Thr Pro Pro Pro Gly Gly
35 40 45

Thr Ser Phe Ala Gly His Asn Ser Ala Ile Cys Ala Ser Ser Ala Ser
50 55 60

Tyr Ile Asn Phe Glu Gly Pro Gln Asn Pro Asp Asn Tyr Leu Val Thr
65 70 75 80

Pro Glu Leu Ser Leu Pro Asn Gly Gly Thr Leu Thr Phe Trp Val Cys
85 90 95

Ala Gln Asp Ala Asn Tyr Ala Ser Glu His Tyr Ala Val Tyr Ala Ser
100 105 110

Ser Thr Gly Asn Asp Ala Ser Asn Phe Ala Asn Ala Leu Leu Glu Glu
115 120 125

Val Leu Thr Ala Lys Thr Val Val Thr Ala Pro Glu Ala Ile Arg Gly
130 135 140

Thr Arg Val Gln Gly Thr Trp Tyr Gln Lys Thr Val Gln Leu Pro Ala
145 150 155 160

Gly Thr Lys Tyr Val Ala Phe Arg His Phe Gly Cys Thr Asp Phe Phe
165 170 175

Trp Ile Asn Leu Asp Asp Val Glu Ile Lys Ala Asn Gly Lys Arg Ala
180 185 190

Asp Phe Thr Glu Thr Phe Glu Ser Ser Thr His Gly Glu Ala Pro Ala
195 200 205

Glu Trp Thr Thr Ile Asp Ala Asp Gly Asp Gly Gln Gly Trp Leu Cys
210 215 220

Leu Ser Ser Gly Gln Leu Gly Trp Leu Thr Ala His Gly Gly Thr Asn
225 230 235 240

Val Val Ala Ser Phe Ser Trp Asn Gly Met Ala Leu Asn Pro Asp Asn
245 250 255

Tyr Leu Ile Ser Lys Asp Val Thr Gly Ala Thr Lys Val Lys Tyr Tyr
260 265 270

Tyr Ala Val Asn Asp Gly Phe Pro Gly Asp His Tyr Ala Val Met Ile
275 280 285

Ser Lys Thr Gly Thr Asn Ala Gly Asp Phe Thr Val Val Phe Glu Glu
290 295 300

Thr Pro Asn Gly Ile Asn Lys Gly Gly Ala Arg Phe Gly Leu Ser Thr
305 310 315 320

Glu Ala Asp Gly Ala Lys Pro Gln Ser Val Trp Ile Glu Arg Thr Val
325 330 335

Asp Leu Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg His Tyr Asn Cys
340 345 350

Ser Asp Leu Asn Tyr Ile Leu Leu Asp Asp Ile Gln Phe Thr Met Gly
355 360 365

Gly Ser Pro Thr Pro Thr Asp Tyr Thr Tyr Thr Val Tyr Arg Asp Gly
370 375 380

Thr Lys Ile Lys Glu Gly Leu Thr Glu Thr Thr Phe Glu Glu Asp Gly
385 390 395 400

Val Ala Thr Gly Asn His Glu Tyr Cys Val Glu Val Lys Tyr Thr Ala
405 410 415

Gly Val Ser Pro Lys Glu Cys Val Asn Val Thr Val Asp Pro Val Gln
420 425 430

Phe Asn Pro Val Gln Asn Leu Thr Gly Ser Ala Val Gly Gln Lys Val
435 440 445

Thr Leu Lys Trp Asp Ala Pro Asn
450 455



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1318 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1318 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

GGT ACC CCG AAT CCA AAT CCA AAT CCG AAT CCG GGA ACA ACA ACA CTT 48 
Gly Thr Pro Asn Pro Asn Pro Asn Pro Asn Pro Gly Thr Thr Thr Leu 
460 465 470 

TCC GAA TCA TTC GAA AAT GGT ATT CCT GCC TCA TGG AAG ACG ATC GAT 96 
Ser Glu Ser Phe Glu Asn Gly Ile Pro Ala Ser Trp Lys Thr Ile Asp
475 480 485 

GCA GAC GGT GAC GGC AAC AAT TGG ACG ACG ACC CCT CCT CCC GGA GGC 144 
Ala Asp Gly Asp Gly Asn Asn Trp Thr Thr Thr Pro Pro Pro Gly Gly 
490 495 500 

ACC TCT TTT GCA GGT CAC AAC AGT GCG ATC TGT GTC TCT TCG GCT TCT 192 
Thr Ser Phe Ala Gly His Asn Ser Ala Ile Cys Val Ser Ser Ala Ser
505 510 515 520 

TAT ATC AAC TTT GAA GGC CCT CAG AAC CCT GAT AAC TAT CTG GTT ACA 240 
Tyr Ile Asn Phe Glu Gly Pro Gln Asn Pro Asp Asn Tyr Leu Val Thr
525 530 535 

CCG GAG CTA TCT CTT CCT GGC GGA GGA ACG CTT ACT TTC TGG GTA TGT 288
Pro Glu Leu Ser Leu Pro Gly Gly Gly Thr Leu Thr Phe Trp Val Cys
540 545 550 

GCA CAA GAT GCC AAT TAT GCA TCA GAG CAC TAT GCC GTG TAT GCA TCT 336 
Ala Gln Asp Ala Asn Tyr Ala Ser Glu His Tyr Ala Val Tyr Ala Ser
555 560 565

TCT ACG GGT AAC GAC GCT TCC AAC TTC GCC AAC GCT TTG TTG GAA GAA 384
Ser Thr Gly Asn Asp Ala Ser Asn Phe Ala Asn Ala Leu Leu Glu Glu 
570 575 580 

GTG CTG ACG GCC AAG ACA GTT GTT ACG GCA CCT GAA GCC ATT CGT GGC 432 
Val Leu Thr Ala Lys Thr Val Val Thr Ala Pro Glu Ala Ile Arg Gly
585 590 595 600 

ACT CGT GTT CAG GGC ACC TGG TAT CAA AAG ACG GTA CAG TTG CCT GCG 480 
Thr Arg Val Gln Gly Thr Trp Tyr Gln Lys Thr Val Gln Leu Pro Ala
605 610 615 

GGT ACT AAG TAT GTT GCC TTC CGT CAC TTC GGC TGT ACG GAC TTC TTC 528 
Gly Thr Lys Tyr Val Ala Phe Arg His Phe Gly Cys Thr Asp Phe Phe 
620 625 630 

TGG ATC AAC CTT GAT GAA GTT GAG ATC AAG GCC AAC GGC AAG CGC GCA 576 
Trp Ile Asn Leu Asp Glu Val Glu Ile Lys Ala Asn Gly Lys Arg Ala
635 640 645 

GAC TTC ACG GAA ACG TTC GAG TCT TCT ACT CAT GGA GAG GCA CCG GCG 624 
Asp Phe Thr Glu Thr Phe Glu Ser Ser Thr His Gly Glu Ala Pro Ala 
650 655 660 

GAA TGG ACT ACT ATC GAT GCC GAT GGC GAT GGT CAG GGT TGG CTC TGT 672 
Glu Trp Thr Thr Ile Asp Ala Asp Gly Asp Gly Gln Gly Trp Leu Cys
665 670 675 680 

CTG TCT TCC GGA CAA TTG GAC TGG CTG ACA GCT CAT GGC GGC ACC AAC 720 
Leu Ser Ser Gly Gln Leu Asp Trp Leu Thr Ala His Gly Gly Thr Asn
685 690 695 

GTA GTA GCC TCT TTC TCA TGG AAT GGA ATG GCT TTG AAT CCT GAT AAC 768 
Val Val Ala Ser Phe Ser Trp Asn Gly Met Ala Leu Asn Pro Asp Asn 
700 705 710 

TAT CTC ATC TCA AAG GAT GTT ACA GGC GCA ACT AAG GTA AAG TAC TAC 816 
Tyr Leu Ile Ser Lys Asp Val Thr Gly Ala Thr Lys Val Lys Tyr Tyr
715 720 725 

TAT GCA GTC AAC GAC GGT TTT CCC GGG GAT CAC TAT GCG GTG ATG ATC 864 
Tyr Ala Val Asn Asp Gly Phe Pro Gly Asp His Tyr Ala Val Met Ile
730 735 740 

TCC AAG ACG GGC ACG AAC GCC GGA GAC TTC ACG GTT GTT TTC GAA GAA 912 
Ser Lys Thr Gly Thr Asn Ala Gly Asp Phe Thr Val Val Phe Glu Glu 
745 750 755 760 

ACG CCT AAC GGA ATA AAT AAG GGC GGA GCA AGA TTC GGT CTT TCC ACG 960 
Thr Pro Asn Gly Ile Asn Lys Gly Gly Ala Arg Phe Gly Leu Ser Thr
765 770 775 

GAA GCC GAT GGC GCC AAA CCT CAA AGT GTA TGG ATC GAG CGT ACG GTA 1008 
Glu Ala Asp Gly Ala Lys Pro Gln Ser Val Trp Ile Glu Arg Thr Val
780 785 790 

GAT TTG CCT GCG GGC ACG AAG TAT GTT GCT TTC CGT CAC TAC AAT TGC 1056 
Asp Leu Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg His Tyr Asn Cys 
795 800 805 

TCG GAT TTG AAC TAC ATT CTT TTG GAT GAT ATT CAG TTC ACC ATG GGT 1104 
Ser Asp Leu Asn Tyr Ile Leu Leu Asp Asp Ile Gln Phe Thr Met Gly
810 815 820

GGC AGC CCC ACC CCG ACC GAT TAT ACC TAC ACG GTG TAT CGT GAC GGT 1152
Gly Ser Pro Thr Pro Thr Asp Tyr Thr Tyr Thr Val Tyr Arg Asp Gly
825 830 835 840

ACG AAG ATC AAG GAA GGT CTG ACC GAA ACG ACC TTC GAA GAA GAT GGT 1200 
Thr Lys Ile Lys Glu Gly Leu Thr Glu Thr Thr Phe Glu Glu Asp Gly
845 850 855 

GTA GCT ACG GGC AAT CAT GAG TAT TGC GTG GAA GTG AAG TAC ACA GCC 1248 
Val Ala Thr Gly Asn His Glu Tyr Cys Val Glu Val Lys Tyr Thr Ala 
860 865 870 

GGC GTA TCT CCG AAG GTG TGT GTA AAC GTA ACT ATT AAT CCG ACT CAG 1296 
Gly Val Ser Pro Lys Val Cys Val Asn Val Thr Ile Asn Pro Thr Gln
875 880 885 



TTC AAT CCT GTA CAG AAC CTG A 1318
Phe Asn Pro Val Gln Asn Leu
890 895

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 439 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:

Gly Thr Pro Asn Pro Asn Pro Asn Pro Asn Pro Gly Thr Thr Thr Leu
1 5 10 15

Ser Glu Ser Phe Glu Asn Gly Ile Pro Ala Ser Trp Lys Thr Ile Asp
20 25 30

Ala Asp Gly Asp Gly Asn Asn Trp Thr Thr Thr Pro Pro Pro Gly Gly
35 40 45

Thr Ser Phe Ala Gly His Asn Ser Ala Ile Cys Val Ser Ser Ala Ser
50 55 60

Tyr Ile Asn Phe Glu Gly Pro Gln Asn Pro Asp Asn Tyr Leu Val Thr
65 70 75 80

Pro Glu Leu Ser Leu Pro Gly Gly Gly Thr Leu Thr Phe Trp Val Cys
85 90 95

Ala Gln Asp Ala Asn Tyr Ala Ser Glu His Tyr Ala Val Tyr Ala Ser
100 105 110

Ser Thr Gly Asn Asp Ala Ser Asn Phe Ala Asn Ala Leu Leu Glu Glu
115 120 125

Val Leu Thr Ala Lys Thr Val Val Thr Ala Pro Glu Ala Ile Arg Gly
130 135 140

Thr Arg Val Gln Gly Thr Trp Tyr Gln Lys Thr Val Gln Leu Pro Ala
145 150 155 160

Gly Thr Lys Tyr Val Ala Phe Arg His Phe Gly Cys Thr Asp Phe Phe
165 170 175

Trp Ile Asn Leu Asp Glu Val Glu Ile Lys Ala Asn Gly Lys Arg Ala
180 185 190

Asp Phe Thr Glu Thr Phe Glu Ser Ser Thr His Gly Glu Ala Pro Ala
195 200 205

Glu Trp Thr Thr Ile Asp Ala Asp Gly Asp Gly Gln Gly Trp Leu Cys
210 215 220

Leu Ser Ser Gly Gln Leu Asp Trp Leu Thr Ala His Gly Gly Thr Asn
225 230 235 240

Val Val Ala Ser Phe Ser Trp Asn Gly Met Ala Leu Asn Pro Asp Asn
245 250 255

Tyr Leu Ile Ser Lys Asp Val Thr Gly Ala Thr Lys Val Lys Tyr Tyr
260 265 270

Tyr Ala Val Asn Asp Gly Phe Pro Gly Asp His Tyr Ala Val Met Ile
275 280 285

Ser Lys Thr Gly Thr Asn Ala Gly Asp Phe Thr Val Val Phe Glu Glu
290 295 300

Thr Pro Asn Gly Ile Asn Lys Gly Gly Ala Arg Phe Gly Leu Ser Thr
305 310 315 320

Glu Ala Asp Gly Ala Lys Pro Gln Ser Val Trp Ile Glu Arg Thr Val
325 330 335

Asp Leu Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg His Tyr Asn Cys
340 345 350

Ser Asp Leu Asn Tyr Ile Leu Leu Asp Asp Ile Gln Phe Thr Met Gly
355 360 365

Gly Ser Pro Thr Pro Thr Asp Tyr Thr Tyr Thr Val Tyr Arg Asp Gly
370 375 380

Thr Lys Ile Lys Glu Gly Leu Thr Glu Thr Thr Phe Glu Glu Asp Gly
385 390 395 400

Val Ala Thr Gly Asn His Glu Tyr Cys Val Glu Val Lys Tyr Thr Ala
405 410 415

Gly Val Ser Pro Lys Val Cys Val Asn Val Thr Ile Asn Pro Thr Gln
420 425 430

Phe Asn Pro Val Gln Asn Leu
435



(2) INFORMATION FOR SEQ ID NO:23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
GGCAAACCAA AAAGATTC 18

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

TTCTTCCAAC GACTACAC 18 

(2) INFORMATION FOR SEQ ID NO:25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6241 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 696..1787

(D) OTHER INFORMATION: /product= "hagD protease"

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1790..5866

(D) OTHER INFORMATION: /product= "hagD hemagglutinin"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: 
GGATCCTACG CCCGATACCC ATACTCGAAG CCTTTGCTCA GTACCATCCT GCAGAAGTTC 60
ACTCTTTCGC ATATAGTGAC CCTCTTTTCT CTCAGCATAA TGGTACCTAT CATATCAGTA 120
AGGGGCATAT TGTCTTTTCG AACAATGTAC AGCCCGAGAA CTCTTTACTT CCACATCACA 180
CCCCCGACTC CTTAGTCAAG GATCTTTTTT CCCCTTTCCC CTCCGCTCTC TTCCTCATGC 240
TGGACTGACT TAACCTTGGT CTGCTCTACT TTTCGGTTGT AAATACATGC AATACAATAA 300
CTTTAAGTGT TGTTAGACAA CACTTTTACA AGACTCTGAC TTTTAATGAG GTGGAGCATG 360
AACCTTTTCC TCTTTCATCT TCTCATTCAG ATTATAGTCA ATATTTTAGT AAAAGGCTAA 420
TTGACAGCCT TTTATAAGGG TTAATCCCTT GTCGCTTATA TTGAAAACAT GTTCTTTATA 480
ATCCGATACT CTTCTTAAAT CGAATTTTTT CTCTAAATTG CGCCGCAACA AAACTCCTTG 540
AGAAAAGTAC CAATAGAAAT AGAAGGTAGC ATTTTGCCTT TAAATTCCTT TTCTTTTCTT 600
GGATTGTTCT TGAAATGAAT CTTATTTGTG GATCTTTTTT GTTTTTTTAA CCCGGCCGTG 660
GTTCTCTGAA TCACGACCAT AAATTGTTTT AAAGT ATG AGG AAA TTA TTA TTG 713

Met Arg Lys Leu Leu Leu 
440 445 

CTG ATC GCG GCG TCC CTT TTG GGA GTT GGT CTT TAC GCC CAA AGC GCC 761 
Leu Ile Ala Ala Ser Leu Leu Gly Val Gly Leu Tyr Ala Gln Ser Ala
450 455 460 

AAG ATT AAG CTT GAT GCT CCG ACT ACT CGA ACG ACA TGT ACG AAC AAT 809 
Lys Ile Lys Leu Asp Ala Pro Thr Thr Arg Thr Thr Cys Thr Asn Asn
465 470 475 

AGC TTC AAG CAG TTC GAT GCA AGC TTT TCG TTC AAT GAA GTC GAG CTG 857 
Ser Phe Lys Gln Phe Asp Ala Ser Phe Ser Phe Asn Glu Val Glu Leu
480 485 490

ACA AAG GTG GAG ACC AAA GGT GGT ACT TTC GCC TCA GTG TCA ATT CCG 905 
Thr Lys Val Glu Thr Lys Gly Gly Thr Phe Ala Ser Val Ser Ile Pro
495 500 505 

GGT GCA TTC CCG ACC GGT GAG GTT GGT TCT CCC GAA GTG CCA GCA GTT 953 
Gly Ala Phe Pro Thr Gly Glu Val Gly Ser Pro Glu Val Pro Ala Val 
510 515 520 525 

AGG AAG TTG ATT GCT GTG CCT GTC GGA GCC ACA CCT GTT GTT CGC GTG 1001 
Arg Lys Leu Ile Ala Val Pro Val Gly Ala Thr Pro Val Val Arg Val
530 535 540 

AAA AGT TTT ACC GAG CAA GTT TAC TCT CTG AAC CAA TAC GGT TCC GAA 1049 
Lys Ser Phe Thr Glu Gln Val Tyr Ser Leu Asn Gln Tyr Gly Ser Glu
545 550 555 

AAA CTC ATG CCA CAT CAA CCC TCT ATG AGC AAG AGT GAT GAT CCC GAA 1097 
Lys Leu Met Pro His Gln Pro Ser Met Ser Lys Ser Asp Asp Pro Glu
560 565 570 

AAG GTT CCC TTC GTT TAC AAT GCT GCT GCT TAT GCA CGC AAA GGT TTT 1145 
Lys Val Pro Phe Val Tyr Asn Ala Ala Ala Tyr Ala Arg Lys Gly Phe 
575 580 585 

GTC GGA CAA GAA CTG ACC CAA GTA GAA ATG TTG GGG ACA ATG CGT GGT 1193 
Val Gly Gln Glu Leu Thr Gln Val Glu Met Leu Gly Thr Met Arg Gly
590 595 600 605

GTT CGC ATT GCA GCT CTT ACC ATT AAT CCT GTT CAG TAT GAT GTG GTT 1241 
Val Arg Ile Ala Ala Leu Thr Ile Asn Pro Val Gln Tyr Asp Val Val
610 615 620 

GCA AAC CAA TTG AAG GTT AGA AAC AAC ATC GAA ATT GAA GTA AGC TTT 1289 
Ala Asn Gln Leu Lys Val Arg Asn Asn Ile Glu Ile Glu Val Ser Phe
625 630 635

CAA GGA GCT GAT GAA GTA GCT ACA CAA CGT TTG TAT GAT GCT TCT TTT 1337 
Gln Gly Ala Asp Glu Val Ala Thr Gln Arg Leu Tyr Asp Ala Ser Phe
640 645 650

AGC CCT TAT TTC GAA ACA GCT TAT AAA CAG CTC TTC AAT AGA GAT GTT 1385 
Ser Pro Tyr Phe Glu Thr Ala Tyr Lys Gln Leu Phe Asn Arg Asp Val
655 660 665

TAT ACA GAT CAT GGC GAC TTG TAT AAT ACG CCG GTT CGT ATG CTT GTT 1433
Tyr Thr Asp His Gly Asp Leu Tyr Asn Thr Pro Val Arg Met Leu Val 
670 675 680 685 

GTT GCA GGT GCA AAA TTC AAA GAA GCT CTC AAG CCT TGG CTC ACT TGG 1481
Val Ala Gly Ala Lys Phe Lys Glu Ala Leu Lys Pro Trp Leu Thr Trp
690 695 700 

AAG GCT CAA AAG GGC TTC TAT CTG GAT GTG CAT TAC ACA GAC GAA GCT 1529 
Lys Ala Gln Lys Gly Phe Tyr Leu Asp Val His Tyr Thr Asp Glu Ala
705 710 715 

GAA GTA GGA ACG ACA AAC GCC TCT ATC AAG GCA TTT ATT CAC AAG AAA 1577 
Glu Val Gly Thr Thr Asn Ala Ser Ile Lys Ala Phe Ile His Lys Lys
720 725 730 

TAC AAT GAT GGA TTG GCA GCT AGT GCT GCT CCG GTC TTC TTG GCT TTG 1625 
Tyr Asn Asp Gly Leu Ala Ala Ser Ala Ala Pro Val Phe Leu Ala Leu 
735 740 745 

GTT GGT GAC ACT GAC GTT ATT AGC GGA GAA AAA GGA AAG AAA ACA AAA 1673 
Val Gly Asp Thr Asp Val Ile Ser Gly Glu Lys Gly Lys Lys Thr Lys
750 755 760 765 

AAA GTT ACC GAC TTG TAT TAC AGT GCA GTC GAT GGC GAC TAT TTC CCT 1721 
Lys Val Thr Asp Leu Tyr Tyr Ser Ala Val Asp Gly Asp Tyr Phe Pro 
770 775 780 

GAA ATG TAT ACT TTC CGT ATG TCT GCT TCT TCC CCA GAA GAA CTG ACG 1769 
Glu Met Tyr Thr Phe Arg Met Ser Ala Ser Ser Pro Glu Glu Leu Thr 
785 790 795 

AAC ATC ATT GAT AAG TAT TG ATG TAT GAA AAG GCT ACC ATG CCG GAT 1816 
Asn Ile Ile Asp Lys Tyr Met Tyr Glu Lys Ala Thr Met Pro Asp
800 1 5 

AAG AGC TAT TTG GAA AAG GCC CTC TTG ATT GCC GGT GCT GAC TCC TAC 1864 
Lys Ser Tyr Leu Glu Lys Ala Leu Leu Ile Ala Gly Ala Asp Ser Tyr
10 15 20 25 

TGG AAT CCT AAG ATA GGC CAG CAA ACC ATC AAA TAT GCT GTA CAG TAT 1912 
Trp Asn Pro Lys Ile Gly Gln Gln Thr Ile Lys Tyr Ala Val Gln Tyr
30 35 40 

TAC TAC AAT CAA GAT CAT GGC TAT ACA GAT GTG TAC AGT TAC CCT AAA 1960 
Tyr Tyr Asn Gln Asp His Gly Tyr Thr Asp Val Tyr Ser Tyr Pro Lys
45 50 55 

GCT CCT TAT ACA GGC TGC TAT AGT CAC TTG AAT ACC GGT GTC GGC TTT 2008
Ala Pro Tyr Thr Gly Cys Tyr Ser His Leu Asn Thr Gly Val Gly Phe
60 65 70 

GCC AAC TAT ACA GCG CAT GGA TCT GAG ACA TCA TGG GCA GAT CCG TCG 2056 
Ala Asn Tyr Thr Ala His Gly Ser Glu Thr Ser Trp Ala Asp Pro Ser 
75 80 85 

CTG ACC GCC ACT CAA GTG AAA GCA CTC ACA AAT AAG GAC AAA TAC TTC 2104 
Leu Thr Ala Thr Gln Val Lys Ala Leu Thr Asn Lys Asp Lys Tyr Phe
90 95 100 105 

TTA GCT ATT GGG AAC TGC TGT GTT ACA GCT CAA TTC GAT TAT CCA CAG 2152 
Leu Ala Ile Gly Asn Cys Cys Val Thr Ala Gln Phe Asp Tyr Pro Gln
110 115 120

CCT TGC TTT GGA GAG GTA ATG ACT CGT GTC AAG GAG AAA GGT GCT TAT 2200
Pro Cys Phe Gly Glu Val Met Thr Arg Val Lys Glu Lys Gly Ala Tyr 
125 130 135 

GCC TAT ATC GGT TCA TCT CCG AAT TCT TAT TGG GGC GAG GAC TAC TAT 2248
Ala Tyr Ile Gly Ser Ser Pro Asn Ser Tyr Trp Gly Glu Asp Tyr Tyr
140 145 150 

TGG AGT GTC GGT GCT AAT GCC GTA TTT GGT GTT CAG CCT ACT TTT GAA 2296 
Trp Ser Val Gly Ala Asn Ala Val Phe Gly Val Gln Pro Thr Phe Glu
155 160 165 

GGT ACG TCT ATG GGT TCT TAT GAT GCT ACA TTC TTG GAA GAT TCG TAC 2344 
Gly Thr Ser Met Gly Ser Tyr Asp Ala Thr Phe Leu Glu Asp Ser Tyr 
170 175 180 185

AAC ACA GTG AAT TCT ATT ATG TGG GCA GGT AAT CTT GCC GCT ACT CAT 2392 
Asn Thr Val Asn Ser Ile Met Trp Ala Gly Asn Leu Ala Ala Thr His
190 195 200 

GCT GGA AAT ATC GGC AAT ATT ACC CAT ATC GGT GCT CAT TAC TAT TGG 2440 
Ala Gly Asn Ile Gly Asn Ile Thr His Ile Gly Ala His Tyr Tyr Trp
205 210 215 

GAA GCT TAT CAT GTC CTT GGC GAT GGT TCG GTT ATG CCT TAT CGT GCA 2488 
Glu Ala Tyr His Val Leu Gly Asp Gly Ser Val Met Pro Tyr Arg Ala 
220 225 230 

ATG CCT AAG ACC AAT ACT TAT ACG CTT CCT GCT TCT CTG CCT CAG AAT 2536 
Met Pro Lys Thr Asn Thr Tyr Thr Leu Pro Ala Ser Leu Pro Gln Asn
235 240 245 

CAG GCT TCT TAT AGC ATT CAG GCT TCT GCC GGT TCT TAC GTA GCT ATT 2584
Gln Ala Ser Tyr Ser Ile Gln Ala Ser Ala Gly Ser Tyr Val Ala Ile
250 255 260 265

TCT AAA GAT GGA GTT TTG TAT GGA ACA GGT GTT GCT AAT GCC AGC GGT 2632
Ser Lys Asp Gly Val Leu Tyr Gly Thr Gly Val Ala Asn Ala Ser Gly
270 275 280

GTT GCG ACT GTG AAT ATG ACT AAG CAG ATT ACG GAA AAT GGT AAT TAT 2680
Val Ala Thr Val Asn Met Thr Lys Gln Ile Thr Glu Asn Gly Asn Tyr
285 290 295

GAT GTA GTT ATC ACT CGC TCT AAT TAT CTT CCT GTG ATC AAG CAA ATT 2728
Asp Val Val Ile Thr Arg Ser Asn Tyr Leu Pro Val Ile Lys Gln Ile
300 305 310

CAG GCA GGA GAG CCT AGC CCC TAC CAG CCT GTT TCC AAC TTG ACT GCT 2776
Gln Ala Gly Glu Pro Ser Pro Tyr Gln Pro Val Ser Asn Leu Thr Ala
315 320 325

ACA ACG CAG GGT CAG AAA GTA ACG CTC AAG TGG GAT GCC CCG AGC GCA 2824
Thr Thr Gln Gly Gln Lys Val Thr Leu Lys Trp Asp Ala Pro Ser Ala
330 335 340 345

AAG AAG GCA GAA GCT TCC CGT GAA GTA AAA CGG ATC GGA GAC GGT CTT 2872
Lys Lys Ala Glu Ala Ser Arg Glu Val Lys Arg Ile Gly Asp Gly Leu
350 355 360

TTC GTT ACG ATC GAA CCT GCA AAC GAT GTA CGT GCC AAC GAA GCC AAG 2920
Phe Val Thr Ile Glu Pro Ala Asn Asp Val Arg Ala Asn Glu Ala Lys
365 370 375

GTT GTG CTC GCA GCA GAC AAC GTA TGG GGA GAC AAT ACG GGT TAC CAG 2968
Val Val Leu Ala Ala Asp Asn Val Trp Gly Asp Asn Thr Gly Tyr Gln
380 385 390 

TTC TTG TTG GAT GCC GAT CAC AAT ACA TTC GGA AGT GTC ATT CCG GCA 3016 
Phe Leu Leu Asp Ala Asp His Asn Thr Phe Gly Ser Val Ile Pro Ala
395 400 405 

ACC GGT CCT CTC TTT ACC GGA ACA GCT TCT TCC AAT CTT TAC AGT GCG 3064 
Thr Gly Pro Leu Phe Thr Gly Thr Ala Ser Ser Asn Leu Tyr Ser Ala 
410 415 420 425 

AAC TTC GAG TAT TTG ATC CCG GCC AAT GCC GAT CCT GTT GTT ACT ACA 3112 
Asn Phe Glu Tyr Leu Ile Pro Ala Asn Ala Asp Pro Val Val Thr Thr
430 435 440 

CAG AAT ATT ATC GTT ACA GGA CAG GGT GAA GTT GTA ATC CCC GGT GGT 3160 
Gln Asn Ile Ile Val Thr Gly Gln Gly Glu Val Val Ile Pro Gly Gly
445 450 455 

GTT TAC GAC TAT TGC ATT ACG AAC CCG GAA CCT GCA TCC GGA AAG ATG 3208 
Val Tyr Asp Tyr Cys Ile Thr Asn Pro Glu Pro Ala Ser Gly Lys Met
460 465 470 

TGG ATC GCA GGA GAT GGA GAC AAC CAG CCT GCA CGT TAT GAC GAT TTC 3256 
Trp Ile Ala Gly Asp Gly Asp Asn Gln Pro Ala Arg Tyr Asp Asp Phe
475 480 485 

ACA TTC GAA GCA GGC AAG AAG TAC ACC TTC ACG ATG CGT CGC GCC GGA 3304 
Thr Phe Glu Ala Gly Lys Lys Tyr Thr Phe Thr Met Arg Arg Ala Gly 
490 495 500 505 

ATG GGA GAT GGA ACT GAT ATG GAA GTC GAA GAC GAT TCA CCT GCA AGC 3352 
Met Gly Asp Gly Thr Asp Met Glu Val Glu Asp Asp Ser Pro Ala Ser 
510 515 520 

TAT ACC TAT ACA GTC TAT CGT GAC GGC ACG AAG ATC AAG GAA GGT CTG 3400 
Tyr Thr Tyr Thr Val Tyr Arg Asp Gly Thr Lys Ile Lys Glu Gly Leu
525 530 535 

ACG GCT ACG ACA TTC GAA GAA GAC GGT GTA GCT GCA GGC AAT CAT GAG 3448 
Thr Ala Thr Thr Phe Glu Glu Asp Gly Val Ala Ala Gly Asn His Glu 
540 545 550 

TAT TGC GTG GAA GTT AAG TAC ACA GCC GGC GTA TCT CCG AAG GTA TGT 3496 
Tyr Cys Val Glu Val Lys Tyr Thr Ala Gly Val Ser Pro Lys Val Cys 
555 560 565 

AAA GAC GTT ACG GTA GAA GGA TCC AAT GAA TTT GCT CCT GTA CAG AAC 3544 
Lys Asp Val Thr Val Glu Gly Ser Asn Glu Phe Ala Pro Val Gln Asn
570 575 580 585 

CTG ACC GGT AGT GCA GTC GGC CAG AAA GTA ACG CTT AAG TGG GAT GCA 3592 
Leu Thr Gly Ser Ala Val Gly Gln Lys Val Thr Leu Lys Trp Asp Ala
590 595 600 

CCT AAT GGT ACC CCA AAT CCG AAT CCG AAT CCG AAT CCG GGA ACA ACA 3640 
Pro Asn Gly Thr Pro Asn Pro Asn Pro Asn Pro Asn Pro Gly Thr Thr 
605 610 615 

ACA CTT TCC GAA TCA TTC GAA AAT GGT ATT CCT GCC TCA TGG AAG ACG 3688 
Thr Leu Ser Glu Ser Phe Glu Asn Gly Ile Pro Ala Ser Trp Lys Thr
620 625 630

ATC GAT GCA GAC GGT GAC GGG CAT GGC TGG AAA CCT GGA AAT GCT CCC 3736
Ile Asp Ala Asp Gly Asp Gly His Gly Trp Lys Pro Gly Asn Ala Pro
635 640 645 

GGA ATC GCT GGC TAC AAT AGC AAT GGT TGT GTA TAT TCA GAG TCA TTC 3784 
Gly Ile Ala Gly Tyr Asn Ser Asn Gly Cys Val Tyr Ser Glu Ser Phe
650 655 660 665 

GGT CTT GGT GGT ATA GGA GTT CTT ACC CCT GAC AAC TAT CTG ATA ACA 3832 
Gly Leu Gly Gly Ile Gly Val Leu Thr Pro Asp Asn Tyr Leu Ile Thr
670 675 680 

CCG GCA TTG GAT TTG GCT AAC GGA GGT AAG TTG ACT TTC TGG GTA TGC 3880 
Pro Ala Leu Asp Leu Ala Asn Gly Gly Lys Leu Thr Phe Trp Val Cys 
685 690 695 

GCA GAG GAT GCT AAT TAT GCA TCC GAG CAC TAT GCG GTG TAT GCA TCT 3928 
Ala Gln Asp Ala Asn Tyr Ala Ser Glu His Tyr Ala Val Tyr Ala Ser
700 705 710 

TCG ACC GGT AAC GAT GCA TCC AAC TTC ACG AAT GCT TTG TTG GAA GAG 3976 
Ser Thr Gly Asn Asp Ala Ser Asn Phe Thr Asn Ala Leu Leu Glu Glu 
715 720 725 

ACG ATT ACG GCA AAA GGT GTT CGC TCG CCG GAA GCT ATT CGT GGT CGT 4024 
Thr Ile Thr Ala Lys Gly Val Arg Ser Pro Glu Ala Ile Arg Gly Arg
730 735 740 745 

ATA CAG GGT ACT TGG CGC CAG AAG ACG GTA GAC CTT CCC GCA GGT ACG 4072 
Ile Gln Gly Thr Trp Arg Gln Lys Thr Val Asp Leu Pro Ala Gly Thr
750 755 760 

AAA TAT GTT GCT TTC CGT CAC TTC CAA AGC ACG GAT ATG TTC TAC ATC 4120 
Lys Tyr Val Ala Phe Arg His Phe Gln Ser Thr Asp Met Phe Tyr Ile
765 770 775 

GAC CTT GAT GAG GTT GAG ATC AAG GCC AAT GGC AAG CGC GCA GAC TTC 4168 
Asp Leu Asp Glu Val Glu Ile Lys Ala Asn Gly Lys Arg Ala Asp Phe
780 785 790 

ACG GAA ACG TTC GAG TCT TCT ACT CAT GGA GAG GCA CCA GCG GAA TGG 4216 
Thr Glu Thr Phe Glu Ser Ser Thr His Gly Glu Ala Pro Ala Glu Trp 
795 800 805 

ACT ACT ATC GAT GCC GAT GGC GAT GGT CAG GAT TGG CTC TGT CTG TCT 4264 
Thr Thr Ile Asp Ala Asp Gly Asp Gly Gln Asp Trp Leu Cys Leu Ser 
810 815 820 825 

TCC GGA CAA TTG GAC TGG CTG ACA GCT CAT GGC GGC ACC AAC GTA GTA 4312 
Ser Gly Gln Leu Asp Trp Leu Thr Ala His Gly Gly Thr Asn Val Val 
830 835 840 

GCC TCT TTC TCA TGG AAT GGA ATG GCT TTG AAT CCT GAT AAC TAT CTC 4360 
Ala Ser Phe Ser Trp Asn Gly Met Ala Leu Asn Pro Asp Asn Tyr Leu 
845 850 855 

ATC TCA AAG GAT GTT ACA GGC GCA ACG AAG GTA AAG TAC TAC TAT GCA 4408 
Ile Ser Lys Asp Val Thr Gly Ala Thr Lys Val Lys Tyr Tyr Tyr Ala 
860 865 870 

GTC AAC GAC GGT TTT CCC GGG GAT CAC TAT GCG GTG ATG ATC TCC AAG 4456 
Val Asn Asp Gly Phe Pro Gly Asp His Tyr Ala Val Met Ile Ser Lys 
875 880 885 

ACG GGC ACG AAC GCC GGA GAC TTC ACG GTT GTT TTC GAA GAA ACG CCT 4504 
Thr Gly Thr Asn Ala Gly Asp Phe Thr Val Val Phe Glu Glu Thr Pro 
890 895 900 905 

AAC GGA ATA AAT AAG GGC GGA GCA AGA TTC GGT CTT TCC ACG GAA GCC 4552 
Asn Gly Ile Asn Lys Gly Gly Ala Arg Phe Gly Leu Ser Thr Glu Ala 
910 915 920 

AAT GGC GCC AAA CCT GAA AGT GTA TGG ATC GAG CGT ACG GTA GAT TTG 4600 
Asn Gly Ala Lys Pro Gln Ser Val Trp Ile Glu Arg Thr Val Asp Leu 
925 930 935 

CCT GCG GGC ACG AAG TAT GTT GCT TTC CGT CAC TAC AAT TGC TCG GAT 4648 
Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg His Tyr Asn Cys Ser Asp 
940 945 950 

TTG GAC TAC ATT CTT TTG GAT GAT ATT CAG TTC ACC ATG GGT GGC AGC 4696 
Leu Asp Tyr Ile Leu Leu Asp Asp Ile Gln Phe Thr Met Gly Gly Ser 
955 960 965 

CCC ACC CCG ACC GAT TAT ACC TAC ACG GTA TAT CGT GAT GGT ACG AAG 4744 
Pro Thr Pro Thr Asp Tyr Thr Tyr Thr Val Tyr Arg Asp Gly Thr Lys 
970 975 980 985 

ATC AAG GAA GGT CTG ACC GAA ACG ACC TTC GAA GAA GAC GGC GTA GCT 4792 
Ile Lys Glu Gly Leu Thr Glu Thr Thr Phe Glu Glu Asp Gly Val Ala 
990 995 1000 

ACG GGC AAT CAT GAG TAT TGC GTG GAA GTG AAG TAC ACA GCC GGC GTA 4840 
Thr Gly Asn His Glu Tyr Cys Val Glu Val Lys Tyr Thr Ala Gly Val 
1005 1010 1015 

TCT CCG AAG GTG TGT GTA AAC GTA ACT ATT AAT CCG ACT CAG TTC AAT 4888 
Ser Pro Lys Val Cys Val Asn Val Thr Ile Asn Pro Thr Gln Phe Asn 
1020 1025 1030 

CCT GTA AAG AAC CTG AAG GCA CAA CCG GAT GGC GGC GAC GTG GTT CTC 4936 
Pro Val Lys Asn Leu Lys Ala Gln Pro Asp Gly Gly Asp Val Val Leu 
1035 1040 1045 

AAG TGG GAA GCC CCG AGT GGC AAA CGA GGA GAA CTG CTT AAT GAA GAT 4984 
Lys Trp Glu Ala Pro Ser Gly Lys Arg Gly Glu Leu Leu Asn Glu Asp 
1050 1055 1060 1065 

TTT GAA GGA GAC GCT ATT CCC ACA GGG TGG ACA GCA TTG GAT GCC GAT 5032 
Phe Glu Gly Asp Ala Ile Pro Thr Gly Trp Thr Ala Leu Asp Ala Asp 
1070 1075 1080 

S3 2£ ?fl f T 1*° ^ MC ACG CTC ^ ^T ACG CGA GGA 5080 
Gly Asp Gly Asn Asn Trp Asp Ile Thr Leu Asn Glu Phe Thr Arg Gly 
1085 1090 1095 

£u So SS JIT f TT I 0 * o CT P A CGC GCC AGC GTA «* ATA ™* 5128 
Glu Arg His Val Leu Ser Pro Leu Arg Ala Ser Asn Val Ala Ile Ser 
1100 1105 1110 

Tvr S-r ill il„ t X? "7 , ^ ml TTG CCT CTC ACG CCG AAC 5176 
Tyr Ser Ser Leu Leu Gln Gly Gln Glu Tyr Leu Pro Leu Thr Pro Asn 
1115 1120 1125 

AAC TTT CTG ATC ACT CCG AAG GTT GAA GGA GCA AAG AAG ATT ACT TAT 5224 
Asn Phe Leu Ile Thr Pro Lys Val Glu Gly Ala Lys Lys Ile Thr Tyr 
1130 1135 1140 1145 

AAG GTG GGT TCA CCG GGT CTT CCT CAA TGG AGT CAT GAT CAT TAT GCA 5272 
Lys Val Gly Ser Pro Gly Leu Pro Gln Trp Ser His Asp His Tyr Ala 
1150 1155 1160 

CTC TGT ATC TCC AAG AGC GGA ACG GCT GCA GCC GAC TTC GAA GTA ATC 5320 
Leu Cys Ile Ser Lys Ser Gly Thr Ala Ala Ala Asp Phe Glu Val Ile 
1165 1170 1175 

TTT GAA GAA ACG ATG ACC TAC ACT CAA GGA GGA GCC AAC TTG ACA AGA 5368 
Phe Glu Glu Thr Met Thr Tyr Thr Gln Gly Gly Ala Asn Leu Thr Arg 
1180 1185 1190 

GAA AAA GAC CTC CCT GCC GGC ACG AAA TAT GTC GCT TTC CGT CAT TAC 5416 
Glu Lys Asp Leu Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg His Tyr 
1195 1200 1205 

AAT TGC ACG GAT GTT CTG GGC ATA ATG ATT GAC GAT GTA GTG ATA ACA 5464 
Asn Cys Thr Asp Val Leu Gly Ile Met Ile Asp Asp Val Val Ile Thr 
1210 1215 1220 1225 

GGT GAA GGC GAA GGT CCC AGT TAC ACC TAC ACG GTG TAT CGT GAC GGC 5512 
Gly Glu Gly Glu Gly Pro Ser Tyr Thr Tyr Thr Val Tyr Arg Asp Gly 
1230 1235 1240 

ACG AAG ATC CAG GAA GGT CTG ACC GAA ACG ACC TAC CGC GAT GCA GGA 5560 
Thr Lys Ile Gln Glu Gly Leu Thr Glu Thr Thr Tyr Arg Asp Ala Gly 
1245 1250 1255 

ATG AGT GCA CAA TCT CAT GAG TAT TGC GTA GAG GTT AAG TAC GCA GCC 5608 
Met Ser Ala Gln Ser His Glu Tyr Cys Val Glu Val Lys Tyr Ala Ala 
1260 1265 1270 



GGC GTA TCT CCG AAG GTT TGT GTG GAT TAT ATT CCT GAT GGA GTG GCA 5656 
Gly Val Ser Pro Lys Val Cys Val Asp Tyr Ile Pro Asp Gly Val Ala 
1275 1280 1285 

GAC GTA ACT GCT CAG AAG CCT TAC ACG CTG ACG GTT GTA GGA AAG ACT 5704 
Asp Val Thr Ala Gln Lys Pro Tyr Thr Leu Thr Val Val Gly Lys Thr 
1290 1295 1300 1305 



ATC ACG GTA ACT TGC CAA GGC GAA GCT ATG ATC TAC GAC ATG AAC GGT 5752 
Ile Thr Val Thr Cys Gln Gly Glu Ala Met Ile Tyr Asp Met Asn Gly 
1310 1315 1320 

CGT CGT CTG GCA GCG GGT CGC AAC ACG GTT GTT TAC ACG GCT CAG GGC 5800 
Arg Arg Leu Ala Ala Gly Arg Asn Thr Val Val Tyr Thr Ala Gln Gly 
1325 1330 1335 



GGC TAC TAT GCA GTC ATG GTT GTC GTT GAC GGC AAG TCT TAC GTA GAG 5848 
Gly Tyr Tyr Ala Val Met Val Val Val Asp Gly Lys Ser Tyr Val Glu 
1340 1345 1350 

AAA CTC GCT ATC AAG TAA TTCTGTCTTG GACTCGGAGA CTTTGTGCAG 5896 
Lys Leu Ala Ile Lys 
1355 



ACACTTTTAA TATAGGTCTG TAATTGTCTC AGAGTATGAA TCGGTCGCCC GACTTCCTTA 5956 

AAAGGAGGTC GGGCGACTTC GTTTTTATTA TTGCTGTCTG GTAAACTTGT CAAGAGGAGA 6016 

CCTTTGAAAA ATGGGGCGGT CAATAATTTT CGGTCTATGG GTCAAATTGC AGGCTACTGT 6076 

TTTAGGTGTA TGTTGGGCTA TCTTCCTATC TTTAAGAGAC CTTTGAAAAA TAAGGAGATG 6136 






WO 96/17936 

PCT/US95/16108 


GAGGGAAGAG GAGTTCTTGG CATAAAAGGA GCGAGTGAAA GGGGTGGCAG TAAGGAGTGA 6196 
AAGTAGTTGT AAATCCCCCC TTTGAGGAGC TACTTGTACG AGCTC 6241 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 364 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

Met Arg Lys Leu Leu Leu Leu Ile Ala Ala Ser Leu Leu Gly Val Gly 
1 5 10 15 

Leu Tyr Ala Gln Ser Ala Lys Ile Lys Leu Asp Ala Pro Thr Thr Arg 
20 25 30 

Thr Thr Cys Thr Asn Asn Ser Phe Lys Gln Phe Asp Ala Ser Phe Ser 
35 40 45 

Phe Asn Glu Val Glu Leu Thr Lys Val Glu Thr Lys Gly Gly Thr Phe 
50 55 60 

Ala Ser Val Ser Ile Pro Gly Ala Phe Pro Thr Gly Glu Val Gly Ser 
65 70 75 80 

Pro Glu Val Pro Ala Val Arg Lys Leu Ile Ala Val Pro Val Gly Ala 
85 90 95 

Thr Pro Val Val Arg Val Lys Ser Phe Thr Glu Gln Val Tyr Ser Leu 
100 105 110 

Asn Gln Tyr Gly Ser Glu Lys Leu Met Pro His Gln Pro Ser Met Ser 
115 120 125 

Lys Ser Asp Asp Pro Glu Lys Val Pro Phe Val Tyr Asn Ala Ala Ala 
130 135 140 

Tyr Ala Arg Lys Gly Phe Val Gly Gln Glu Leu Thr Gln Val Glu Met 
145 150 155 160 

Leu Gly Thr Met Arg Gly Val Arg Ile Ala Ala Leu Thr Ile Asn Pro 
165 170 175 

Val Gln Tyr Asp Val Val Ala Asn Gln Leu Lys Val Arg Asn Asn Ile 
180 185 190 

Glu Ile Glu Val Ser Phe Gln Gly Ala Asp Glu Val Ala Thr Gln Arg 
195 200 205 

Leu Tyr Asp Ala Ser Phe Ser Pro Tyr Phe Glu Thr Ala Tyr Lys Gln 
210 215 220 

Leu Phe Asn Arg Asp Val Tyr Thr Asp His Gly Asp Leu Tyr Asn Thr 
225 230 235 240 

Pro Val Arg Met Leu Val Val Ala Gly Ala Lys Phe Lys Glu Ala Leu 
245 250 255 

Lys Pro Trp Leu Thr Trp Lys Ala Gln Lys Gly Phe Tyr Leu Asp Val 
260 265 270 

His Tyr Thr Asp Glu Ala Glu Val Gly Thr Thr Asn Ala Ser Ile Lys 
275 280 285 

Ala Phe Ile His Lys Lys Tyr Asn Asp Gly Leu Ala Ala Ser Ala Ala 
290 295 300 

Pro Val Phe Leu Ala Leu Val Gly Asp Thr Asp Val Ile Ser Gly Glu 
305 310 315 320 

Lys Gly Lys Lys Thr Lys Lys Val Thr Asp Leu Tyr Tyr Ser Ala Val 
325 330 335 

Asp Gly Asp Tyr Phe Pro Glu Met Tyr Thr Phe Arg Met Ser Ala Ser 
340 345 350 

Ser Pro Glu Glu Leu Thr Asn Ile Ile Asp Lys Tyr 
355 360 



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1358 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

Met Tyr Glu Lys Ala Thr Met Pro Asp Lys Ser Tyr Leu Glu Lys Ala 
1 5 10 15 

Leu Leu Ile Ala Gly Ala Asp Ser Tyr Trp Asn Pro Lys Ile Gly Gln 
20 25 30 

Gln Thr Ile Lys Tyr Ala Val Gln Tyr Tyr Tyr Asn Gln Asp His Gly 
35 40 45 

Tyr Thr Asp Val Tyr Ser Tyr Pro Lys Ala Pro Tyr Thr Gly Cys Tyr 
50 55 60 

Ser His Leu Asn Thr Gly Val Gly Phe Ala Asn Tyr Thr Ala His Gly 
65 70 75 80 

Ser Glu Thr Ser Trp Ala Asp Pro Ser Leu Thr Ala Thr Gln Val Lys 
85 90 95 

Ala Leu Thr Asn Lys Asp Lys Tyr Phe Leu Ala Ile Gly Asn Cys Cys 
100 105 110 

Val Thr Ala Gln Phe Asp Tyr Pro Gln Pro Cys Phe Gly Glu Val Met 
115 120 125 

Thr Arg Val Lys Glu Lys Gly Ala Tyr Ala Tyr Ile Gly Ser Ser Pro 
130 135 140 

Asn Ser Tyr Trp Gly Glu Asp Tyr Tyr Trp Ser Val Gly Ala Asn Ala 
145 150 155 160 

Val Phe Gly Val Gln Pro Thr Phe Glu Gly Thr Ser Met Gly Ser Tyr 
165 170 175 

Asp Ala Thr Phe Leu Glu Asp Ser Tyr Asn Thr Val Asn Ser Ile Met 
180 185 190 

Trp Ala Gly Asn Leu Ala Ala Thr His Ala Gly Asn Ile Gly Asn Ile 
195 200 205 

Thr His Ile Gly Ala His Tyr Tyr Trp Glu Ala Tyr His Val Leu Gly 
210 215 220 

Asp Gly Ser Val Met Pro Tyr Arg Ala Met Pro Lys Thr Asn Thr Tyr 
225 230 235 240 

Thr Leu Pro Ala Ser Leu Pro Gln Asn Gln Ala Ser Tyr Ser Ile Gln 
245 250 255 

Ala Ser Ala Gly Ser Tyr Val Ala Ile Ser Lys Asp Gly Val Leu Tyr 
260 265 270 

Gly Thr Gly Val Ala Asn Ala Ser Gly Val Ala Thr Val Asn Met Thr 
275 280 285 

Lys Gln Ile Thr Glu Asn Gly Asn Tyr Asp Val Val Ile Thr Arg Ser 
290 295 300 

Asn Tyr Leu Pro Val Ile Lys Gln Ile Gln Ala Gly Glu Pro Ser Pro 
305 310 315 320 

Tyr Gln Pro Val Ser Asn Leu Thr Ala Thr Thr Gln Gly Gln Lys Val 
325 330 335 

Thr Leu Lys Trp Asp Ala Pro Ser Ala Lys Lys Ala Glu Ala Ser Arg 
340 345 350 

Glu Val Lys Arg Ile Gly Asp Gly Leu Phe Val Thr Ile Glu Pro Ala 
355 360 365 

Asn Asp Val Arg Ala Asn Glu Ala Lys Val Val Leu Ala Ala Asp Asn 
370 375 380 

Val Trp Gly Asp Asn Thr Gly Tyr Gln Phe Leu Leu Asp Ala Asp His 
385 390 395 400 

Asn Thr Phe Gly Ser Val Ile Pro Ala Thr Gly Pro Leu Phe Thr Gly 
405 410 415 

Thr Ala Ser Ser Asn Leu Tyr Ser Ala Asn Phe Glu Tyr Leu Ile Pro 
420 425 430 

Ala Asn Ala Asp Pro Val Val Thr Thr Gln Asn Ile Ile Val Thr Gly 
435 440 445 

Gln Gly Glu Val Val Ile Pro Gly Gly Val Tyr Asp Tyr Cys Ile Thr 
450 455 460 

Asn Pro Glu Pro Ala Ser Gly Lys Met Trp Ile Ala Gly Asp Gly Asp 
465 470 475 480 

Asn Gln Pro Ala Arg Tyr Asp Asp Phe Thr Phe Glu Ala Gly Lys Lys 
485 490 495 

Tyr Thr Phe Thr Met Arg Arg Ala Gly Met Gly Asp Gly Thr Asp Met 
500 505 510 

Glu Val Glu Asp Asp Ser Pro Ala Ser Tyr Thr Tyr Thr Val Tyr Arg 
515 520 525 

Asp Gly Thr Lys Ile Lys Glu Gly Leu Thr Ala Thr Thr Phe Glu Glu 
530 535 540 

Asp Gly Val Ala Ala Gly Asn His Glu Tyr Cys Val Glu Val Lys Tyr 
545 550 555 560 

Thr Ala Gly Val Ser Pro Lys Val Cys Lys Asp Val Thr Val Glu Gly 
565 570 575 

Ser Asn Glu Phe Ala Pro Val Gln Asn Leu Thr Gly Ser Ala Val Gly 
580 585 590 

Gln Lys Val Thr Leu Lys Trp Asp Ala Pro Asn Gly Thr Pro Asn Pro 
595 600 605 

Asn Pro Asn Pro Asn Pro Gly Thr Thr Thr Leu Ser Glu Ser Phe Glu 
610 615 620 

Asn Gly Ile Pro Ala Ser Trp Lys Thr Ile Asp Ala Asp Gly Asp Gly 
625 630 635 640 

His Gly Trp Lys Pro Gly Asn Ala Pro Gly Ile Ala Gly Tyr Asn Ser 
645 650 655 

Asn Gly Cys Val Tyr Ser Glu Ser Phe Gly Leu Gly Gly Ile Gly Val 
660 665 670 

Leu Thr Pro Asp Asn Tyr Leu Ile Thr Pro Ala Leu Asp Leu Ala Asn 
675 680 685 

Gly Gly Lys Leu Thr Phe Trp Val Cys Ala Gln Asp Ala Asn Tyr Ala 
690 695 700 

Ser Glu His Tyr Ala Val Tyr Ala Ser Ser Thr Gly Asn Asp Ala Ser 
705 710 715 720 

Asn Phe Thr Asn Ala Leu Leu Glu Glu Thr Ile Thr Ala Lys Gly Val 
725 730 735 

Arg Ser Pro Glu Ala Ile Arg Gly Arg Ile Gln Gly Thr Trp Arg Gln 
740 745 750 

Lys Thr Val Asp Leu Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg His 
755 760 765 

Phe Gln Ser Thr Asp Met Phe Tyr Ile Asp Leu Asp Glu Val Glu Ile 
770 775 780 

Lys Ala Asn Gly Lys Arg Ala Asp Phe Thr Glu Thr Phe Glu Ser Ser 
785 790 795 800 

Thr His Gly Glu Ala Pro Ala Glu Trp Thr Thr Ile Asp Ala Asp Gly 
805 810 815 

Asp Gly Gln Asp Trp Leu Cys Leu Ser Ser Gly Gln Leu Asp Trp Leu 
820 825 830 

Thr Ala His Gly Gly Thr Asn Val Val Ala Ser Phe Ser Trp Asn Gly 
835 840 845 

Met Ala Leu Asn Pro Asp Asn Tyr Leu Ile Ser Lys Asp Val Thr Gly 
850 855 860 

Ala Thr Lys Val Lys Tyr Tyr Tyr Ala Val Asn Asp Gly Phe Pro Gly 
865 870 875 880 

Asp His Tyr Ala Val Met Ile Ser Lys Thr Gly Thr Asn Ala Gly Asp 
885 890 895 

Phe Thr Val Val Phe Glu Glu Thr Pro Asn Gly Ile Asn Lys Gly Gly 
900 905 910 

Ala Arg Phe Gly Leu Ser Thr Glu Ala Asn Gly Ala Lys Pro Gln Ser 
915 920 925 

Val Trp Ile Glu Arg Thr Val Asp Leu Pro Ala Gly Thr Lys Tyr Val 
930 935 940 

Ala Phe Arg His Tyr Asn Cys Ser Asp Leu Asp Tyr Ile Leu Leu Asp 
945 950 955 960 

Asp Ile Gln Phe Thr Met Gly Gly Ser Pro Thr Pro Thr Asp Tyr Thr 
965 970 975 

Tyr Thr Val Tyr Arg Asp Gly Thr Lys Ile Lys Glu Gly Leu Thr Glu 
980 985 990 

Thr Thr Phe Glu Glu Asp Gly Val Ala Thr Gly Asn His Glu Tyr Cys 
995 1000 1005 

Val Glu Val Lys Tyr Thr Ala Gly Val Ser Pro Lys Val Cys Val Asn 
1010 1015 1020 

Val Thr Ile Asn Pro Thr Gln Phe Asn Pro Val Lys Asn Leu Lys Ala 
1025 1030 1035 1040 

Gln Pro Asp Gly Gly Asp Val Val Leu Lys Trp Glu Ala Pro Ser Gly 
1045 1050 1055 

Lys Arg Gly Glu Leu Leu Asn Glu Asp Phe Glu Gly Asp Ala Ile Pro 
1060 1065 1070 

Thr Gly Trp Thr Ala Leu Asp Ala Asp Gly Asp Gly Asn Asn Trp Asp 
1075 1080 1085 

Ile Thr Leu Asn Glu Phe Thr Arg Gly Glu Arg His Val Leu Ser Pro 
1090 1095 1100 

Leu Arg Ala Ser Asn Val Ala Ile Ser Tyr Ser Ser Leu Leu Gln Gly 
1105 1110 1115 1120 

Gln Glu Tyr Leu Pro Leu Thr Pro Asn Asn Phe Leu Ile Thr Pro Lys 
1125 1130 1135 

Val Glu Gly Ala Lys Lys Ile Thr Tyr Lys Val Gly Ser Pro Gly Leu 
1140 1145 1150 

Pro Gln Trp Ser His Asp His Tyr Ala Leu Cys Ile Ser Lys Ser Gly 
1155 1160 1165 

Thr Ala Ala Ala Asp Phe Glu Val Ile Phe Glu Glu Thr Met Thr Tyr 
1170 1175 1180 

Thr Gln Gly Gly Ala Asn Leu Thr Arg Glu Lys Asp Leu Pro Ala Gly 
1185 1190 1195 1200 

Thr Lys Tyr Val Ala Phe Arg His Tyr Asn Cys Thr Asp Val Leu Gly 
1205 1210 1215 

Ile Met Ile Asp Asp Val Val Ile Thr Gly Glu Gly Glu Gly Pro Ser 
1220 1225 1230 

Tyr Thr Tyr Thr Val Tyr Arg Asp Gly Thr Lys Ile Gln Glu Gly Leu 
1235 1240 1245 

Thr Glu Thr Thr Tyr Arg Asp Ala Gly Met Ser Ala Gln Ser His Glu 
1250 1255 1260 

Tyr Cys Val Glu Val Lys Tyr Ala Ala Gly Val Ser Pro Lys Val Cys 
1265 1270 1275 1280 

Val Asp Tyr Ile Pro Asp Gly Val Ala Asp Val Thr Ala Gln Lys Pro 
1285 1290 1295 

Tyr Thr Leu Thr Val Val Gly Lys Thr Ile Thr Val Thr Cys Gln Gly 
1300 1305 1310 

Glu Ala Met Ile Tyr Asp Met Asn Gly Arg Arg Leu Ala Ala Gly Arg 
1315 1320 1325 

Asn Thr Val Val Tyr Thr Ala Gln Gly Gly Tyr Tyr Ala Val Met Val 
1330 1335 1340 

Val Val Asp Gly Lys Ser Tyr Val Glu Lys Leu Ala Ile Lys 
1345 1350 1355 



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8640 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 971..6031 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

AGGCCTTTGA GACGGGCACA AGCCGCCGCA GCCTCCTCTT CGAAGGTGTC TCGAACGTCC 60 

ACATCGGTGA ATCCGTAGCA GTGCTCATTG CCATTGAGCA GCACCGAGGT GTGGCGCATC 120 

AGATATATTT TCATCAGTGG ATTATTAGGG TATCGGTCAG AAAAAGCCTT CCGAATCCGA 180 

CAAAGATAGT AGAAAGAGAG TGCATCTGAA AACAGATCAT TCGAGGATTA TCGATCAACT 240 

GAAAAGGCAG GAGTTGTTTT GCGTTTTGGT TCGGAAAATT ACCTGATCAG CATTCGTAAA 300 

AACGTGGCGC GAGAATTTTT TCGTTTTGGC GCGAGAATTA AAAATTTTTG GAACCACAGC 360 

GAAAAAAATC TCGCGCCGTT TTCTCAGGAT TTACAGACCA CAATCCGAGC ATTTTCGGTT 420 

CGTAATTCAT CGAAGAGACA GGTTTTACCG CATTGAAATC AGAGAGAGAA TATCCGTAGT 480 

CCAACGGTTC ATCCTTATAT CAGAGGTTAA AAGATATGGT ACGCTCATCG AGGAGCTGAT 540 

TGGCTTAGTA GGTGAGACTT TCTTAAGAGA CTATCGGCAC CTACAGGAAG TTCATGGCAC 600 

ACAAGGCAAA GGAGGCAATC TTCGCAGACC GGACTCATAT CAAAAGGATG AAACGACTTT 660 

TCCATACGAC AACCAAATAG CCGTCTACGG TAGACGAATG CAAACCCAAT ATGAGGCCAT 720 

CAATCAATCC GAATGACAGC TTTTGGGCAA TATATTATGC ATATTTTGAT TCGCGTTTAA 780 

AGGAAAAGTG CATATATTTG CGATTGTGGT ATTTCTTTCG GTTTCTATGT GAATTTTGTC 840 

TCCCAAGAAG ACTTTATAAT GCATAAATAC AGAAGGGGTA CTACACAGTA AAATCATATT 900 

CTAATTTCAT CAAAATGAAA AACTTGAACA AGTTTGTTTC ATTGCTCTTT GCTCTTCCTT 960 

ATTAGGAGGA ATG GCA TTT GCG CAG CAG ACA GAG TTG GGA CGC AAT CCG 1009 
Met Ala Phe Ala Gln Gln Thr Glu Leu Gly Arg Asn Pro 
1360 1365 1370 

AAT GTC AGA TTG CTC GAA TCC ACT CAG CAA TCG GTG ACA AAG GTT CAG 1057 
Asn Val Arg Leu Leu Glu Ser Thr Gln Gln Ser Val Thr Lys Val Gln 
1375 1380 1385 

TTC CGT ATG GAC AAC CTC AAG TTC ACC GAA GTT CAA ACC CCT AAG GGA 1105 
Phe Arg Met Asp Asn Leu Lys Phe Thr Glu Val Gln Thr Pro Lys Gly 
1390 1395 1400 

ATG GCA CAA GTG CCG ACC TAT ACA GAA GGG GTT AAT CTT TCC GAA AAA 1153 
Met Ala Gln Val Pro Thr Tyr Thr Glu Gly Val Asn Leu Ser Glu Lys 
1405 1410 1415 1420 

GGG ATG CCT ACG CTT CCC ATT CTA TCA CGC TCT TTG GCG GTT TCA GAC 1201 
Gly Met Pro Thr Leu Pro Ile Leu Ser Arg Ser Leu Ala Val Ser Asp 
1425 1430 1435 

ACT CGT GAG ATG AAG GTA GAG GTT GTT TCC TCA AAG TTC ATC GAA AAG 1249 
Thr Arg Glu Met Lys Val Glu Val Val Ser Ser Lys Phe Ile Glu Lys 
1440 1445 1450 

AAA AAT GTC CTG ATT GCA CCC TCC AAG GGC ATG ATT ATG CGT AAC GAA 1297 
Lys Asn Val Leu Ile Ala Pro Ser Lys Gly Met Ile Met Arg Asn Glu 
1455 1460 1465 

GAT CCG AAA AAG ATC CCT TAC GTT TAT GGA AAG AGC TAC TCG CAA AAC 1345 
Asp Pro Lys Lys Ile Pro Tyr Val Tyr Gly Lys Ser Tyr Ser Gln Asn 
1470 1475 1480 

AAA TTC TTC CCG GGA GAG ATC GCC ACG CTT GAT GAT CCT TTT ATC CTT 1393 
Lys Phe Phe Pro Gly Glu Ile Ala Thr Leu Asp Asp Pro Phe Ile Leu 
1485 1490 1495 1500 

CGT GAT GTG CGT GGA CAG GTT GTA AAC TTT GCG CCT TTG CAG TAT AAC 1441 
Arg Asp Val Arg Gly Gln Val Val Asn Phe Ala Pro Leu Gln Tyr Asn 
1505 1510 1515 

CCT GTG ACA AAG ACG TTG CGC ATC TAT ACG GAA ATC ACT GTG GCA GTG 1489 
Pro Val Thr Lys Thr Leu Arg Ile Tyr Thr Glu Ile Thr Val Ala Val 
1520 1525 1530 

AGC GAA ACT TCG GAA CAA GGC AAA AAT ATT CTG AAC AAG AAA GGT ACA 1537 
Ser Glu Thr Ser Glu Gln Gly Lys Asn Ile Leu Asn Lys Lys Gly Thr 
1535 1540 1545 

TTT GCC GGC TTT GAA GAC ACA TAC AAG CGC ATG TTC ATG AAC TAC GAG 1585 
Phe Ala Gly Phe Glu Asp Thr Tyr Lys Arg Met Phe Met Asn Tyr Glu 
1550 1555 1560 

CCG GGG CGT TAC ACA CCG GTA GAG GAA AAA CAA AAT GGT CGT ATG ATC 1633 
Pro Gly Arg Tyr Thr Pro Val Glu Glu Lys Gln Asn Gly Arg Met Ile 
1565 1570 1575 1580 

GTC ATC GTA GCC AAA AAG TAT GAG GGA GAT ATT AAA GAT TTC GTT GAT 1681 
Val Ile Val Ala Lys Lys Tyr Glu Gly Asp Ile Lys Asp Phe Val Asp 
1585 1590 1595 

TGG AAA AAC CAA CGC GGT CTC CGT ACC GAG GTG AAA GTG GCA GAA GAT 1729 
Trp Lys Asn Gln Arg Gly Leu Arg Thr Glu Val Lys Val Ala Glu Asp 
1600 1605 1610 

ATT GCT TCT CCC GTT ACA GCT AAT GCT ATT CAG CAG TTC GTT AAG CAA 1777 
Ile Ala Ser Pro Val Thr Ala Asn Ala Ile Gln Gln Phe Val Lys Gln 
1615 1620 1625 

GAA TAC GAG AAA GAA GGT AAT GAT TTG ACC TAT GTT CTT TTG GTT GGC 1825 
Glu Tyr Glu Lys Glu Gly Asn Asp Leu Thr Tyr Val Leu Leu Val Gly 
1630 1635 1640 

GAT CAC AAA GAT ATT CCT GCC AAA ATT ACT CCG GGG ATC AAA TCC GAC 1873 
Asp His Lys Asp Ile Pro Ala Lys Ile Thr Pro Gly Ile Lys Ser Asp 
1645 1650 1655 1660 

GAG GTA TAT GGA CAA ATA GTA GGT AAT GAC CAC TAC AAC GAA GTC TTC 1921 
Gln Val Tyr Gly Gln Ile Val Gly Asn Asp His Tyr Asn Glu Val Phe 
1665 1670 1675 

ATC GGT CGT TTC TCA TGT GAG AGC AAA GAG GAT CTG AAG ACA CAA ATC 1969 
Ile Gly Arg Phe Ser Cys Glu Ser Lys Glu Asp Leu Lys Thr Gln Ile 
1680 1685 1690 

GAT CGG ACT ATT CAC TAT GAG CGC AAT ATA ACC ACG GAA GAC AAA TGG 2017 
Asp Arg Thr Ile His Tyr Glu Arg Asn Ile Thr Thr Glu Asp Lys Trp 
1695 1700 1705 

CTC GGT CAG GCT CTT TGT ATT GCT TCG GCT GAA GGA GGC CCA TCC GCA 2065 
Leu Gly Gln Ala Leu Cys Ile Ala Ser Ala Glu Gly Gly Pro Ser Ala 
1710 1715 1720 

GAC AAT GGT GAA AGT GAT ATC CAG CAT GAG AAT GTA ATC GCC AAT CTG 2113 
Asp Asn Gly Glu Ser Asp Ile Gln His Glu Asn Val Ile Ala Asn Leu 
1725 1730 1735 1740 

CTT ACC CAG TAT GGC TAT ACC AAG ATT ATC AAA TGT TAT GAT CCG GGA 2161 
Leu Thr Gln Tyr Gly Tyr Thr Lys Ile Ile Lys Cys Tyr Asp Pro Gly 
1745 1750 1755 

GTA ACT CCT AAA AAC ATT ATT GAT GCT TTC AAC GGA GGA ATC TCG TTG 2209 
Val Thr Pro Lys Asn Ile Ile Asp Ala Phe Asn Gly Gly Ile Ser Leu 
1760 1765 1770 

SI? ^ ™! ^ G ^ S* 0 G ? T AGC ACA G ^T TGG GGT ACG TCT CAC 

Gly 1 
1785 



— — unn ui_t Tl»5 GGT ACG TCT CAC 

Val Asn Tyr^Thr Gly His Gly Ser Glu Thr Ala Trp Gly Thr Ser 2s 



ll C ^u C ACT ^ GTG m CTT ACC AAC AGC AAC CAG CTA CCG 

fly Thr Thr His val Lys Gin Leu Thr Asn Ser Asn Gin Leu Sro 
X ' 9U 1795 1800 

TTT ATT TTC GAC GTA GCT TGT GTG AAT GGC GAT TTC CTA TTC AGC ATG 
Phe Ile Phe Asp Val Ala Cys Val Asn Gly Asp Phe Leu Phe Ser Met 
1805 1810 1815 1820 

CCT TGC TTC GCA GAA GCC CTG ATG CGT GCA CAA AAA GAT GGT AAG CCG 
Pro Cys Phe Ala Glu Ala Leu Met Arg Ala Gln Lys Asp Gly Lys Pro 
1825 1830 1835 

ACA GGT ACT GTT GCT ATC ATA GCG TCT ACG ATC AAC CAG TCT TGG GCT 
Thr Gly Thr Val Ala Ile Ile Ala Ser Thr Ile Asn Gln Ser Trp Ala 
1840 1845 185 0 

TCT CCT ATG CGC GGG CAG GAT GAG ATG AAC GAA ATT CTG TGC GAA AAA 
Ser Pro Met Arg Gly Gln Asp Glu Met Asn Glu Ile Leu Cys Glu Lys 
1855 1860 1865 

25 ^ ^ ^ CGT ACT TTC GGT GGT GTC ACC ATG AAC GGT 

?«™ ASn Ile LyS **» Thr phe G1 y G1 y Val Thr Met Asn Gly 
1870 1875 188 0 



2257 



2305 



2353 



2401 



2449 



2497 



2545 



m TG ok T ff 1 ATG GTG M AAG TAT AAA AAG GAT GGT GAG AAG ATG CTC 
?885 ?8So LyS ^? 5 Gly G1U " ^ "» 

-5* l GG ACT GTT TTC 000 ^ CCC TCG CTG CTC GTT CGT ACA CTT 
Asp Thr Trp Thr Val Phe Gly Asp Pro Ser Leu Leu Val Arg Thr Leu 
1905 1910 1915 

GTC CCG ACC AAA ATG CAG GTT ACG GCT CCG GCT CAG ATT AAT TTG ACC 
Val Pro Thr Lys Met Gln Val Thr Ala Pro Ala Gln Ile Asn Leu Thr 
1920 1925 i93 0 

GAT GCT TCA GTC AAC GTA TCT TGC GAT TAT AAT GGT GCT ATT GCT ACC 
Asp Ala Ser Val Asn Val Ser Cys Asp Tyr Asn Gly Ala Ile Ala Thr 
1935 1940 1945 

ATT TCA GCC AAT GGA AAG ATG TTC GGT TCT GCA GTT GTC GAA AAT GCa 
He Ser Ala Asn Gly Lys Met Phe Gly ser Ala vll vlt 2* J£ £J 

1955 1960 

52 SI ?2 tT C f T f TG ACA 661 CTG ACA AAT GAA AGC ACG CTT ACC 2833 
Thr Ala Thr Ile Asn Leu Thr Gly Leu Thr Asn Glu Ser Thr Leu Thr 

1970 1975 1980 

CTT ACA GTA GTT GGT TAC AAC AAA GAG ACG GTT ATT AAG ACC ATC Mr 
Leu Thr Val Val Gly Tyr Asn Lys Glu Thr Val ill 5£ J£r Til £S 
1985 1990 1995 

£ 5= 5 S2 25 £ £ £SS ?2 S £S SI 
!S 55 S S5 Si? £ S £ £ 5 S S iS S 

2015 2020 2025 



2593 



2641 



2689 



2737 



2785 



2881 



2929 



2977 

AAA ACC AAT GCA ACC ACT AAT ACC GCT CGC AGC GTG GAT GGC ATA CGA 3025 
Lys Thr Asn Ala Thr Thr Asn Thr Ala Arg Ser Val Asp Gly Ile Arg 
2030 2035 2040 

GAA TTG GTT CTT CTG TCA GTC AGC GAT GCC CCC GAA CTT CTT CGC AGC 3073 
Glu Leu Val Leu Leu Ser Val Ser Asp Ala Pro Glu Leu Leu Arg Ser 
2045 2050 2055 2060 

GGT CAG GCC GAG ATT GTT CTT GAA GCT CAC GAT GTT TGG AAT GAT GGA 3121 
Gly Gln Ala Glu Ile Val Leu Glu Ala His Asp Val Trp Asn Asp Gly 
2065 2070 2075 

TCC GGT TAT CAG ATT CTT TTG GAT GCA GAC CAT GAT GAA TAT GGA CAG 3169 
Ser Gly Tyr Gln Ile Leu Leu Asp Ala Asp His Asp Gln Tyr Gly Gln 
2080 2085 2090 

GTT ATA CCC AGT GAT ACC CAT ACT CTT TGG CCG AAC TGT AGT GTC CCG 3217 
Val Ile Pro Ser Asp Thr His Thr Leu Trp Pro Asn Cys Ser Val Pro 
2095 2100 2105 

GCC AAT CTG TTC GCT CCG TTC GAA TAT ACT GTT CCG GAA AAT GCA GAT 3265 
Ala Asn Leu Phe Ala Pro Phe Glu Tyr Thr Val Pro Glu Asn Ala Asp 
2110 2115 2120 

CCT TCT TGT TCC CCT ACC AAT ATG ATA ATG GAT GGT ACT GCA TCC GTT 3313 
Pro Ser Cys Ser Pro Thr Asn Met Ile Met Asp Gly Thr Ala Ser Val 
2125 2130 2135 2140 

AAT ATA CCG GCC GGA ACT TAT GAC TTT GCA ATT GCT GCT CCT CAA GCA 3361 
Asn Ile Pro Ala Gly Thr Tyr Asp Phe Ala Ile Ala Ala Pro Gln Ala 
2145 2150 2155 

AAT GCA AAG ATT TGG ATT GCC GGA CAA GGA CCG ACG AAA GAA GAT GAT 3409 
Asn Ala Lys Ile Trp Ile Ala Gly Gln Gly Pro Thr Lys Glu Asp Asp 
2160 2165 2170 

TAT GTA TTT GAA GCC GGT AAA AAA TAC CAT TTC CTT ATG AAG AAG ATG 3457 
Tyr Val Phe Glu Ala Gly Lys Lys Tyr His Phe Leu Met Lys Lys Met 
2175 2180 2185 

GGT AGC GGT GAT GGA ACT GAA TTG ACT ATA AGC GAA GGT GGT GGA AGC 3505 
Gly Ser Gly Asp Gly Thr Glu Leu Thr Ile Ser Glu Gly Gly Gly Ser 
2190 2195 2200 

GAT TAC ACC TAT ACT GTC TAT CGT GAC GGC ACG AAG ATC AAG GAA GGT 3553 
Asp Tyr Thr Tyr Thr Val Tyr Arg Asp Gly Thr Lys Ile Lys Glu Gly 
2205 2210 2215 2220 

CTG ACG GCT ACG ACA TTC GAA GAA GAC GGT GTA GCT ACG GGC AAT CAT 3601 
Leu Thr Ala Thr Thr Phe Glu Glu Asp Gly Val Ala Thr Gly Asn His 
2225 2230 2235 

GAG TAT TGC GTG GAA GTT AAG TAC ACA GCC GGC GTA TCT CCG AAG GTA 3649 
Glu Tyr Cys Val Glu Val Lys Tyr Thr Ala Gly Val Ser Pro Lys Val 
2240 2245 2250 

TGT AAA GAC GTT ACG GTA GAA GGA TCC AAT GAA TTT GCT CCT GTA CAG 3697 
Cys Lys Asp Val Thr Val Glu Gly Ser Asn Glu Phe Ala Pro Val Gln 
2255 2260 2265 

AAC CTG ACC GGT AGT GCA GTC GGC CAG AAA GTA ACG CTC AAG TGG GAT 3745 
Asn Leu Thr Gly Ser Ala Val Gly Gln Lys Val Thr Leu Lys Trp Asp 
2270 2275 2280 

GCA CCT AAT GGT ACC CCG AAT CCA AAT CCG AAT CCG AAT CCG AAT CCG 
Ala Pro Asn Gly Thr Pro Asn Pro Asn Pro Asn Pro Asn Pro Asn Pro 
2285 2290 2295 2300 

AAT CCC GGA ACA ACA ACA CTT TCC GAA TCA TTC GAA AAT GGT ATT CCT 
Asn Pro Gly Thr Thr Thr Leu Ser Glu Ser Phe Glu Asn Gly Ile Pro 
2305 2310 2315 



TAT TCA GAG TCA TTC GGT CTT GGT GGT ATA GGA GTT CTT ACC CCT GAC 

Tyr Ser Glu Ser Phe Gly Leu Gly Gly Ile Gly Val Leu Thr Pro Asp 
2350 2355 2360 

AAC TAT CTG ATA ACA CCG GCA TTG GAT TTG CCT AAC GGA GGT AAG TTG 
Asn Tyr Leu Ile Thr Pro Ala Leu Asp Leu Pro Asn Gly Gly Lys Leu 

ACT TTC TGG GTA TGC GCA CAG GAT GCT AAT TAT GCA TCC GAG CAC TAT 
Thr Phe Trp Val Cys Ala Gln Asp Ala Asn Tyr Ala Ser Glu His Tyr 
2385 2390 2 395 

°f G w T f I AT TCT TCG ACC 557 ** C GAT GCA TCC AAC TTC ACG AAT 

Ala Val Tyr Ala Ser Ser Thr Gly Asn Asp Ala Ser Asn Phe Thr Asn 
2400 2405 2410 

GCT TTG TTG GAA GAG ACG ATT ACG GCA AAA GGT GTT CGC TCG CCG GAA 
Ala Leu Leu Glu Glu Thr Ile Thr Ala Lys Gly Val Arg Ser Pro Glu 
2415 2420 2425 

GCT ATT CGT GGT CGT ATA CAG GGT ACT TGG CGC CAG AAG ACG GTA GAC 

Ala Ile Arg Gly Arg Ile Gln Gly Thr Trp Arg Gln Lys Thr Val Asp 

2435 2440 

CTT CCC GCA GGT ACG AAA TAT GTT GCT TTC CGT CAC TTC CAA AGC ACC 
Leu Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg His Phe Gln Ser Thr 

2445 2450 2455 2 460 

GAT ATG TTC TAC ATC GAC CTT GAT GAG GTT GAG ATC AAG GCC AAC GGC 
Asp Met Phe Tyr Ile Asp Leu Asp Glu Val Glu Ile Lys Ala Asn Gly 

2465 2470 2475 

f GC ft* TTC ACG «* ACG TTC GAG TCT TCT ACT CAT GGA GAG 

Lys Arg Ala Asp Phe Thr Glu Thr Phe Glu Ser Ser Thr His Gly Glu 
2480 2485 2490 

^ A 5! A T C ?* T GCC GAT GGC GAT GGT CAG GGT 

Asp C 
2505 

SS Itl ? !. G If? ~? ^ ^ ? TG ^ CTG ACA GCT CAT GGC 

Leu 1 
2520 



3793 



3841 



GCC TCA TGG AAG ACG ATC GAT GCA GAC GGT GAC GGG CAT GGC TGG AAG 3889 
Ala Ser Trp Lys Thr Ile Asp Ala Asp Gly Asp Gly His Gly Trp Lys 

CCT GGA AAT GCT CCC GGA ATC GCT GGC TAC AAT AGC AAT GGT TGT GTA 
Pro Gly Asn Ala Pro Gly Ile Ala Gly Tyr Asn Ser Asn Gly Cys Val 
2335 2340 2345 



3937 



3985 



4033 



4081 



4129 



4177 



4225 



4273 



4321 



4369 



4417 



4465 



nl " "Z ™ r?" * w ^ 1 ^ GAT GGC GAT GGT CAG GGT 

^95 P ^ 2500 ASP ** P ^?- Gly Gln ®S 

TGT CTG TCT TCC GGA CAA Tre GAC TGG CTG ACA GCT CAT GGC 
Trp Leu Cys Leu Ser Ser Gly Gln Leu Asp Trp Leu Thr Ala His Gly 
,tf3 - LU 2515 2520 

GGC ACC AAC GTA GTA GCC TCT TTC TCA TGG AAT GGA ATG GCT TTG 
Gly Thr Asn Val Val Ala Ser Phe Ser Trp Asn SJ £? A?a 2! 55 
2530 2535 2540 

CCT GAT AAC TAT CTC ATC TCA AAG GAT GTT ACA GGC GCA ACG AAG GTA 4561 
Pro Asp Asn Tyr Leu Ile Ser Lys Asp Val Thr Gly Ala Thr Lys Val 
2545 2550 2555 

AAG TAC TAC TAT GCA GTC AAC GAC GGT TTT CCC GGG GAT CAC TAT GCG 4609 
Lys Tyr Tyr Tyr Ala Val Asn Asp Gly Phe Pro Gly Asp His Tyr Ala 
2560 2565 2570 

GTG ATG ATC TCC AAG ACG GGC ACG AAC GCC GGA GAC TTC ACG GTT GTT 4657 
Val Met Ile Ser Lys Thr Gly Thr Asn Ala Gly Asp Phe Thr Val Val 
2575 2580 2585 

TTC GAA GAA ACG CCT AAC GGA ATA AAT AAG GGC GGA GCA AGA TTC GGT 4705 
Phe Glu Glu Thr Pro Asn Gly Ile Asn Lys Gly Gly Ala Arg Phe Gly 
2590 2595 2600 

CTT TCC ACG GAA GCC AAT GGC GCC AAA CCT CAA AGT GTA TGG ATC GAG 4753 
Leu Ser Thr Glu Ala Asn Gly Ala Lys Pro Gln Ser Val Trp Ile Glu 
2605 2610 2615 2620 

CGT ACG GTA GAT TTG CCT GCG GGC ACG AAG TAT GTT GCT TTC CGT CAC 4801 
Arg Thr Val Asp Leu Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg His 
2625 2630 2635 

TAC AAT TGC TCG GAT TTG AAC TAC ATT CTT TTG GAT GAT ATT CAG TTC 4849 
Tyr Asn Cys Ser Asp Leu Asn Tyr Ile Leu Leu Asp Asp Ile Gln Phe 
2640 2645 2650 

ACC ATG GGT GGC AGC CCC ACC CCG ACC GAT TAT ACC TAC ACG GTG TAT 4897 
Thr Met Gly Gly Ser Pro Thr Pro Thr Asp Tyr Thr Tyr Thr Val Tyr 
2655 2660 2665 

CGT GAC GGT ACG AAG ATC AAG GAA GGT CTG ACC GAA ACG ACC TTC GAA 4945 
Arg Asp Gly Thr Lys Ile Lys Glu Gly Leu Thr Glu Thr Thr Phe Glu 
2670 2675 2680 

GAA GAC GGC GTA GCT ACA GGC AAT CAT GAG TAT TGC GTG GAA GTG AAG 4993 
Glu Asp Gly Val Ala Thr Gly Asn His Glu Tyr Cys Val Glu Val Lys 
2685 2690 2695 2700 

TAC ACA GCC GGC GTA TCT CCG AAA GAG TGC GTA AAC GTA ACT ATT AAT 5041 
Tyr Thr Ala Gly Val Ser Pro Lys Glu Cys Val Asn Val Thr Ile Asn 
2705 2710 2715 

CCG ACT CAG TTC AAT CCT GTA AAG AAC CTG AAG GCA CAA CCG GAT GGC 5089 
Pro Thr Gln Phe Asn Pro Val Lys Asn Leu Lys Ala Gln Pro Asp Gly 
2720 2725 2730 

GGC GAC GTG GTT CTC AAG TGG GAA GCC CCG AGC GCA AAA AAG ACA GAA 5137 
Gly Asp Val Val Leu Lys Trp Glu Ala Pro Ser Ala Lys Lys Thr Glu 
2735 2740 2745 

GGT TCT CGT GAA GTA AAA CGG ATC GGA GAC GGT CTT TTC GTT ACG ATC 5185 
Gly Ser Arg Glu Val Lys Arg Ile Gly Asp Gly Leu Phe Val Thr Ile 

2750 2755 2760 

GAA CCT GCA AAC GAT GTA CGT GCC AAC GAA GCC AAG GTT GTG CTC GCA 5233 
Glu Pro Ala Asn Asp Val Arg Ala Asn Glu Ala Lys Val Val Leu Ala 
2765 2770 2775 2780 

GCA GAC AAC GTA TGG GGA GAC AAT ACG GGT TAC CAG TTC TTG TTG GAT 5281 
Ala Asp Asn Val Trp Gly Asp Asn Thr Gly Tyr Gln Phe Leu Leu Asp 
2785 2790 2795 

GCC GAT CAC AAT ACA TTC GGA AGT GTC ATT CCG GCA ACC GGT CCT CTC 5329 
Ala Asp His Asn Thr Phe Gly Ser Val Ile Pro Ala Thr Gly Pro Leu 
2800 2805 2810 

TTT ACC GGA ACA GCT TCT TCC AAT CTT TAC AGT GCG AAC TTC GAG TAT 5377 
Phe Thr Gly Thr Ala Ser Ser Asn Leu Tyr Ser Ala Asn Phe Glu Tyr 
2815 2820 2825 

TTG ATC CCG GCC AAT GCC GAT CCT GTT GTT ACT ACA CAG AAT ATT ATC 5425 
Leu Ile Pro Ala Asn Ala Asp Pro Val Val Thr Thr Gln Asn Ile Ile
2830 2835 2840 

GTT ACA GGA CAG GGT GAA GTT GTA ATC CCC GGT GGT GTT TAC GAC TAT 5473 
Val Thr Gly Gln Gly Glu Val Val Ile Pro Gly Gly Val Tyr Asp Tyr
2845 2850 2855 2860 

TGC ATT ACG AAC CCG GAA CCT GCA TCC GGA AAG ATG TGG ATC GCA GGA 5521 
Cys Ile Thr Asn Pro Glu Pro Ala Ser Gly Lys Met Trp Ile Ala Gly
2865 2870 2875 

GAT GGA GGC AAC CAG CCT GCA CGT TAT GAC GAT TTC ACA TTC GAA GCA 5569 
Asp Gly Gly Asn Gln Pro Ala Arg Tyr Asp Asp Phe Thr Phe Glu Ala
2880 2885 2890 

GGC AAG AAG TAC ACC TTC ACG ATG CGT CGC GCC GGA ATG GGA GAT GGA 5617 
Gly Lys Lys Tyr Thr Phe Thr Met Arg Arg Ala Gly Met Gly Asp Gly 
2895 2900 2905 

ACT GAT ATG GAA GTC GAA GAC GAT TCA CCT GCA AGC TAT ACC TAT ACA 5665 
Thr Asp Met Glu Val Glu Asp Asp Ser Pro Ala Ser Tyr Thr Tyr Thr 
2910 2915 2920 

GTC TAT CGT GAC GGC ACG AAG ATC AAG GAA GGT CTG ACC GAA ACG ACC 5713 
Val Tyr Arg Asp Gly Thr Lys Ile Lys Glu Gly Leu Thr Glu Thr Thr
2925 2930 2935 2940 

TAC CGC GAT GCA GGA ATG AGT GCA CAA TCT CAT GAG TAT TGC GTA GAG 5761 
Tyr Arg Asp Ala Gly Met Ser Ala Gln Ser His Glu Tyr Cys Val Glu
2945 2950 2955 



GTT AAG TAC GCA GCC GGC GTA TCT CCG AAG GTT TGT GTG GAT TAT ATT 5809
Val Lys Tyr Ala Ala Gly Val Ser Pro Lys Val Cys Val Asp Tyr Ile
2960 2965 2970



CCT GAC GGA GTG GCA GAC GTA ACG GCT CAG AAG CCT TAC ACG CTG ACA 5857
Pro Asp Gly Val Ala Asp Val Thr Ala Gln Lys Pro Tyr Thr Leu Thr
2975 2980 2985



GTT GTT GGA AAG ACG ATC ACG GTA ACT TGC CAA GGC GAA GCT ATG ATC 5905 
Val Val Gly Lys Thr Ile Thr Val Thr Cys Gln Gly Glu Ala Met Ile
2990 2995 3000 

TAC GAC ATG AAC GGT CGT CGT CTG GCA GCC GGT CGC AAC ACA GTT GTT 5953 
Tyr Asp Met Asn Gly Arg Arg Leu Ala Ala Gly Arg Asn Thr Val Val 
3005 3010 3015 3020 



TAC ACG GCT CAG GGC GGC TAC TAT GCA GTC ATG GTT GTC GTT GAC GGC 6001
Tyr Thr Ala Gln Gly Gly Tyr Tyr Ala Val Met Val Val Val Asp Gly
3025 3030 3035



AAG TCT TAC GTA GAG AAA CTC GCT GTA AAG TAATTCTGTC TTGGACTCGG 6051
Lys Ser Tyr Val Glu Lys Leu Ala Val Lys
3040 3045

AGACTTTGTG CAGACACTTT TAATATAGGT CTGTAATTGT CTCAGAGTAT GAATCGATCG 6111

CCCGACCTCC TTTTAAGGAA GTCGGGCGAC TTCGTTTTTA TGCCTATTAT TCTAATATAC 6171

TTCTGAAACA ATTTGTTCCA AAAAGTTGCA TGAAAAGATT ATCTTACTAT CTTTGCACTG 6231

GTTTCCTAAG GTTTTCCCCG GAGTAGTACG GTAATAACGG TGTGGTAGTT 6291

CTGTCACGCA GGGGGTCGCG GGTTCGAGTC CCGTCCATAC 6351

TTGAAATAGT TTTTCGGCAC AGCTCCATTT TTGTATGTTA 6411

[illegible] ATTGCCGGAT GAGATTATTC AATATGCTCG GAAGATTTTC 6471

GT TTGTCTTTAT TACGATCTGC TTGGGACATA GGGATTAAAT 6531

GCAGGAGGGA CGGTACATGG AGTCGCCCGG CCAATCAGAT GAAGAAAGAA 6591

TGATTTTTAT GGGAACGGCC GATTTTGCTG TTCCGGCACT CCGAGCTTTG 6651

GATACCAAGT AAAAGCTGTG GTCACTATGC CGGACAAGCC TATGGGTCGA 6711

GGACATAAGG TAAGTCCCAG TATGGTCAAA CTATACGCAC AGGAATTGGG TCTGCCTATT 6771

CTCCAGCCGG ACAATCTGAA CGAGGAATCT TTTCTCGATG AACTACGGAC TTATCAGCCG 6831

CACTTGCAAA TCGTAGTGGC TTTCCGTATG CTTCCTCGCT CCGTATGGCA AATGCCCCCC 6891

ATGGGAACAA TCAATCTGCA TGGCTCTCTG CTGCCCATGT ATCGAGGAGC AGCCCCTATC 6951

AACCACGCGA TACGCCATGG CGATACGGAA ACGGGAGTTA CCACCTTCCG CCTCCGGCAT 7011

GAGATAGATA CGGGTGAAGT ACTGCTGCAA GAGAAGTTGC CTATAGGACA TGAAGAGACT 7071

TTCGGCGAAT TGTACGAACG TATGGCTACT CTCGGTGCAT CCGTATTGGT GCACACAGTG 7131

GACTTGTTTC TCGAAGGAGA ACCCGTCTCC ATACCACAGG AGCAACTTCC GGGCTATGTT 7191

GGTGCTCGAC CGGCTCCGAA GATTTTCAAA GACGACTGCC GTATCGATTG GGACAAACCG 7251

GCTGAAGAGG TACACAATTT CATCCGCAGC ATATCGCCTG CCCCTACAGC TTGGACCAAG 7311

CTTCATCGTC CAGGGATGGA GTCCATCGTG CTGAAAATAT ACCGTACCCA AGTGATAGAA 7371

CGAGAACCGC GACACAGAGG CCGATTCGGC TCCATCATAT GGGACAAGAA AAACCTCGAC 7431

GTGATGACCC GCAAAGGGGT CATACGTATA CTCTCGCTCC AAATGCCCGG CAAGAAACAA 7491

ATGGATGCTG CCTCTTTCCT CAATGGTTTC GCTTTGTCCT CAGATATGTA TATAGAATAG 7551

GAGAGAGCTT GTTCCAAGGT TTGAACTGCT CATTTTCTGA CCTCTTGCAT ACAATAACAA 7611

TGTGCGGAGG ATACTTCTCT GCTCAACGTT CAGAGAAGCA GTTGGTCGTA GGCCGAGCCA 7671

ATCACATGGT TTTTTAACTT TGTAACCGAA AAATAAACGA CATCATGAAA GAAAACGAAA 7731

AGCCGACAGC TGCTGCCGGA ACCGTAACCA CCACCGATAA GACAAAGCCT GATTGGCGCA 7791

AAATCCTACC TTATGCTGCG GTCGTACTCC TTTTCATAGC CCTCGCTTTG GCCTATTTCT 7851

ATCCCGCCTC ATTCGACGGG CGTGTACTGT TCCAGGGCGA CGTAGCGGGA GCCAGCGGTA 7911

CGGCGCAGGA CGTACGCGAT TGGGAGGCAC AGACAGGAGA ACACTCCTAT TGGACCAACA 7971

GTCTCTTCGG GGGGATGCCT ATGTACCAGA TTTCGCCAAG CTATCCCTCT ACCCATACGC 8031

TCCAAACCAT ACAGGATGTT CTGACCCTGC GCAAGCCTTT CTATCTATTA GGCACCTATG 8091

CCTGGATGCT TTTTGCCATG ATGGGAGGGT TCTTTCTTTT CCTTAGATCG CTTCGAATCA 8151

GGATTTTGCC GGCAGTCATA GGCTCCATCG CATGGGCCTT TTCTTCCTAC TTCCTGATTC 8211

TGATTATGGC CGGACATATA TGGAAGCTGA CAGCTATGTG TTTTATTCCT CCTACTCTTG 8271

CCGGTATGAT CTGGATCTAC AATGGGAGGT GGTTGGCAGG CGGTAGCGTG ATGGCTTTTT 8331

TCACGGCTTT GCAAGTCTTG GCTAATCATG TACAGATGAG CTATTACTTC CTGTTCGTCA 8391

TGTTTTTCAT GGTGTTGGCT TTCTTGGCAG AAGCCATTCA AACAAAACGA ATCCGACACT 8451

TCTTCCTTTC CTCGGCAGTA GTCGTCATAG CAGGTCTGGT GGGTATAGCT GTGAATAGTA 8511

CCAACCTCTT CCACACCTAC CAATACGGCA AAGAGACCAT GCGTGGAGGT AGCGAACTGA 8571

CGCTGAAGCA GAGCGGAGCA CCCACGGATC AAGTGACGCA TGAGAATAAA AGCGGACTGG 8631

ACAAGGCCT 8640




(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1687 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

Met Ala Phe Ala Gln Gln Thr Glu Leu Gly Arg Asn Pro Asn Val Arg
1 5 10 15

Leu Leu Glu Ser Thr Gln Gln Ser Val Thr Lys Val Gln Phe Arg Met
20 25 30

Asp Asn Leu Lys Phe Thr Glu Val Gln Thr Pro Lys Gly Met Ala Gln
35 40 45

Val Pro Thr Tyr Thr Glu Gly Val Asn Leu Ser Glu Lys Gly Met Pro
50 55 60

Thr Leu Pro Ile Leu Ser Arg Ser Leu Ala Val Ser Asp Thr Arg Glu
65 70 75 80

Met Lys Val Glu Val Val Ser Ser Lys Phe Ile Glu Lys Lys Asn Val
85 90 95

Leu Ile Ala Pro Ser Lys Gly Met Ile Met Arg Asn Glu Asp Pro Lys
100 105 110

Lys Ile Pro Tyr Val Tyr Gly Lys Ser Tyr Ser Gln Asn Lys Phe Phe
115 120 125

Pro Gly Glu Ile Ala Thr Leu Asp Asp Pro Phe Ile Leu Arg Asp Val
130 135 140






Arg Gly Gln Val Val Asn Phe Ala Pro Leu Gln Tyr Asn Pro Val Thr
145 150 155 160

Lys Thr Leu Arg Ile Tyr Thr Glu Ile Thr Val Ala Val Ser Glu Thr
165 170 175

Ser Glu Gln Gly Lys Asn Ile Leu Asn Lys Lys Gly Thr Phe Ala Gly
180 185 190

Phe Glu Asp Thr Tyr Lys Arg Met Phe Met Asn Tyr Glu Pro Gly Arg
195 200 205

Tyr Thr Pro Val Glu Glu Lys Gln Asn Gly Arg Met Ile Val Ile Val
210 215 220

Ala Lys Lys Tyr Glu Gly Asp Ile Lys Asp Phe Val Asp Trp Lys Asn
225 230 235 240

Gln Arg Gly Leu Arg Thr Glu Val Lys Val Ala Glu Asp Ile Ala Ser
245 250 255

Pro Val Thr Ala Asn Ala Ile Gln Gln Phe Val Lys Gln Glu Tyr Glu
260 265 270

Lys Glu Gly Asn Asp Leu Thr Tyr Val Leu Leu Val Gly Asp His Lys
275 280 285

Asp Ile Pro Ala Lys Ile Thr Pro Gly Ile Lys Ser Asp Gln Val Tyr
290 295 300

Gly Gln Ile Val Gly Asn Asp His Tyr Asn Glu Val Phe Ile Gly Arg
305 310 315 320

Phe Ser Cys Glu Ser Lys Glu Asp Leu Lys Thr Gln Ile Asp Arg Thr
325 330 335

Ile His Tyr Glu Arg Asn Ile Thr Thr Glu Asp Lys Trp Leu Gly Gln
340 345 350

Ala Leu Cys Ile Ala Ser Ala Glu Gly Gly Pro Ser Ala Asp Asn Gly
355 360 365

Glu Ser Asp Ile Gln His Glu Asn Val Ile Ala Asn Leu Leu Thr Gln
370 375 380

Tyr Gly Tyr Thr Lys Ile Ile Lys Cys Tyr Asp Pro Gly Val Thr Pro
385 390 395 400

Lys Asn Ile Ile Asp Ala Phe Asn Gly Gly Ile Ser Leu Val Asn Tyr
405 410 415

Thr Gly His Gly Ser Glu Thr Ala Trp Gly Thr Ser His Phe Gly Thr
420 425 430

Thr His Val Lys Gln Leu Thr Asn Ser Asn Gln Leu Pro Phe Ile Phe
435 440 445

Asp Val Ala Cys Val Asn Gly Asp Phe Leu Phe Ser Met Pro Cys Phe
450 455 460

Ala Glu Ala Leu Met Arg Ala Gln Lys Asp Gly Lys Pro Thr Gly Thr
465 470 475 480

Val Ala Ile Ile Ala Ser Thr Ile Asn Gln Ser Trp Ala Ser Pro Met
485 490 495

Arg Gly Gln Asp Glu Met Asn Glu Ile Leu Cys Glu Lys His Pro Asn
500 505 510

Asn Ile Lys Arg Thr Phe Gly Gly Val Thr Met Asn Gly Met Phe Ala
515 520 525

Met Val Glu Lys Tyr Lys Lys Asp Gly Glu Lys Met Leu Asp Thr Trp 
530 535 540 

Thr Val Phe Gly Asp Pro Ser Leu Leu Val Arg Thr Leu Val Pro Thr 
545 550 555 560 

Lys Met Gln Val Thr Ala Pro Ala Gln Ile Asn Leu Thr Asp Ala Ser
565 570 575

Val Asn Val Ser Cys Asp Tyr Asn Gly Ala Ile Ala Thr Ile Ser Ala
580 585 590

Asn Gly Lys Met Phe Gly Ser Ala Val Val Glu Asn Gly Thr Ala Thr 
595 600 605 

Ile Asn Leu Thr Gly Leu Thr Asn Glu Ser Thr Leu Thr Leu Thr Val
610 615 620

Val Gly Tyr Asn Lys Glu Thr Val Ile Lys Thr Ile Asn Thr Asn Gly
625 630 635 640

Glu Pro Asn Pro Tyr Gln Pro Val Ser Asn Leu Thr Ala Thr Thr Gln
645 650 655

Gly Gln Lys Val Thr Leu Lys Trp Asp Ala Pro Ser Thr Lys Thr Asn
660 665 670

Ala Thr Thr Asn Thr Ala Arg Ser Val Asp Gly Ile Arg Glu Leu Val
675 680 685

Leu Leu Ser Val Ser Asp Ala Pro Glu Leu Leu Arg Ser Gly Gln Ala
690 695 700

Glu Ile Val Leu Glu Ala His Asp Val Trp Asn Asp Gly Ser Gly Tyr
705 710 715 720

Gln Ile Leu Leu Asp Ala Asp His Asp Gln Tyr Gly Gln Val Ile Pro
725 730 735

Ser Asp Thr His Thr Leu Trp Pro Asn Cys Ser Val Pro Ala Asn Leu 
740 745 750 

Phe Ala Pro Phe Glu Tyr Thr Val Pro Glu Asn Ala Asp Pro Ser Cys 
755 760 765 

Ser Pro Thr Asn Met Ile Met Asp Gly Thr Ala Ser Val Asn Ile Pro
770 775 780

Ala Gly Thr Tyr Asp Phe Ala Ile Ala Ala Pro Gln Ala Asn Ala Lys
785 790 795 800

Ile Trp Ile Ala Gly Gln Gly Pro Thr Lys Glu Asp Asp Tyr Val Phe
805 810 815
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Glu Ala Gly Lys Lys Tyr His Phe Leu Met Lys Lys Met Gly Ser Gly 
820 825 830 

Asp Gly Thr Glu Leu Thr Ile Ser Glu Gly Gly Gly Ser Asp Tyr Thr
835 840 845

Tyr Thr Val Tyr Arg Asp Gly Thr Lys Ile Lys Glu Gly Leu Thr Ala
850 855 860

Thr Thr Phe Glu Glu Asp Gly Val Ala Thr Gly Asn His Glu Tyr Cys 
865 870 875 880 

Val Glu Val Lys Tyr Thr Ala Gly Val Ser Pro Lys Val Cys Lys Asp 
885 890 895 

Val Thr Val Glu Gly Ser Asn Glu Phe Ala Pro Val Gln Asn Leu Thr
900 905 910

Gly Ser Ala Val Gly Gln Lys Val Thr Leu Lys Trp Asp Ala Pro Asn
915 920 925

Gly Thr Pro Asn Pro Asn Pro Asn Pro Asn Pro Asn Pro Asn Pro Gly 
930 935 940 

Thr Thr Thr Leu Ser Glu Ser Phe Glu Asn Gly Ile Pro Ala Ser Trp
945 950 955 960

Lys Thr Ile Asp Ala Asp Gly Asp Gly His Gly Trp Lys Pro Gly Asn
965 970 975

Ala Pro Gly Ile Ala Gly Tyr Asn Ser Asn Gly Cys Val Tyr Ser Glu
980 985 990

Ser Phe Gly Leu Gly Gly Ile Gly Val Leu Thr Pro Asp Asn Tyr Leu
995 1000 1005

Ile Thr Pro Ala Leu Asp Leu Pro Asn Gly Gly Lys Leu Thr Phe Trp
1010 1015 1020

Val Cys Ala Gln Asp Ala Asn Tyr Ala Ser Glu His Tyr Ala Val Tyr
1025 1030 1035 1040

Ala Ser Ser Thr Gly Asn Asp Ala Ser Asn Phe Thr Asn Ala Leu Leu 
1045 1050 1055 

Glu Glu Thr Ile Thr Ala Lys Gly Val Arg Ser Pro Glu Ala Ile Arg
1060 1065 1070

Gly Arg Ile Gln Gly Thr Trp Arg Gln Lys Thr Val Asp Leu Pro Ala
1075 1080 1085

Gly Thr Lys Tyr Val Ala Phe Arg His Phe Gln Ser Thr Asp Met Phe
1090 1095 1100

Tyr Ile Asp Leu Asp Glu Val Glu Ile Lys Ala Asn Gly Lys Arg Ala
1105 1110 1115 1120

Asp Phe Thr Glu Thr Phe Glu Ser Ser Thr His Gly Glu Ala Pro Ala 
1125 1130 1135 

Glu Trp Thr Thr Ile Asp Ala Asp Gly Asp Gly Gln Gly Trp Leu Cys
1140 1145 1150

Leu Ser Ser Gly Gln Leu Asp Trp Leu Thr Ala His Gly Gly Thr Asn
1155 1160 1165

Val Val Ala Ser Phe Ser Trp Asn Gly Met Ala Leu Asn Pro Asp Asn
1170 1175 1180

Xaa Leu Ile Ser Lys Asp Val Thr Gly Ala Thr Lys Val Lys Tyr Tyr
1185 1190 1195 1200

Tyr Ala Val Asn Asp Gly Phe Pro Gly Asp His Tyr Ala Val Met Ile
1205 1210 1215

Ser Lys Thr Gly Thr Asn Ala Gly Asp Phe Thr Val Val Phe Glu Glu
1220 1225 1230

Thr Pro Asn Gly Ile Asn Lys Gly Gly Ala Arg Phe Gly Leu Ser Thr
1235 1240 1245

Glu Xaa Xaa Gly Ala Lys Pro Gln Ser Val Trp Ile Glu Arg Thr Val
1250 1255 1260

Asp Leu Pro Ala Gly Thr Lys Tyr Val Ala Phe Arg His Tyr Asn Cys
1265 1270 1275 1280

Ser Asp Leu Asn Tyr Ile Leu Leu Asp Asp Ile Gln Phe Thr Met Gly
1285 1290 1295

Gly Ser Pro Thr Pro Thr Asp Tyr Thr Tyr Thr Val Tyr Arg Asp Gly
1300 1305 1310

Thr Lys Ile Lys Glu Gly Leu Thr Glu Thr Thr Phe Glu Glu Asp Gly
1315 1320 1325

Val Ala Thr Gly Asn His Glu Tyr Cys Val Glu Val Lys Tyr Thr Ala
1330 1335 1340

Gly Val Ser Pro Lys Glu Cys Val Asn Val Thr Ile Asn Pro Thr Gln
1345 1350 1355 1360

Phe Asn Pro Val Lys Asn Leu Lys Ala Gln Pro Asp Gly Gly Asp Val
1365 1370 1375

Val Leu Lys Trp Glu Ala Pro Ser Ala Lys Lys Thr Glu Gly Ser Arg
1380 1385 1390

Glu Val Lys Arg Ile Gly Asp Gly Leu Phe Val Thr Ile Glu Pro Ala
1395 1400 1405

Asn Asp Val Arg Ala Asn Glu Ala Lys Val Val Leu Ala Ala Asp Asn
1410 1415 1420

Val Trp Gly Asp Asn Thr Gly Tyr Gln Phe Leu Leu Asp Ala Asp His
1425 1430 1435 1440

Asn Thr Phe Gly Ser Val Ile Pro Ala Thr Gly Pro Leu Phe Thr Gly
1445 1450 1455

Thr Ala Ser Ser Asn Leu Tyr Ser Ala Asn Phe Glu Tyr Leu Ile Pro
1460 1465 1470

Ala Asn Ala Asp Pro Val Val Thr Thr Gln Asn Ile Ile Val Thr Gly
1475 1480 1485

Gln Gly Glu Val Val Ile Pro Gly Gly Val Tyr Asp Tyr Cys Ile Thr
1490 1495 1500

Asn Pro Glu Pro Ala Ser Gly Lys Met Trp Ile Ala Gly Asp Gly Gly
1505 1510 1515 1520

Asn Gln Pro Ala Arg Tyr Asp Asp Phe Thr Phe Glu Ala Gly Lys Lys
1525 1530 1535

Tyr Thr Phe Thr Met Arg Arg Ala Gly Met Gly Asp Gly Thr Asp Met 
1540 1545 1550 

Glu Val Glu Asp Asp Ser Pro Ala Ser Tyr Thr Tyr Thr Val Tyr Arg 
1555 1560 1565 

Asp Gly Thr Lys Ile Lys Glu Gly Leu Thr Glu Thr Thr Tyr Arg Asp
1570 1575 1580

Ala Gly Met Ser Ala Gln Ser His Glu Tyr Cys Val Glu Val Lys Tyr
1585 1590 1595 1600

Ala Ala Gly Val Ser Pro Lys Val Cys Val Asp Tyr Ile Pro Asp Gly
1605 1610 1615

Val Ala Asp Val Thr Ala Gln Lys Pro Tyr Thr Leu Thr Val Val Gly
1620 1625 1630

Lys Thr Ile Thr Val Thr Cys Gln Gly Glu Ala Met Ile Tyr Asp Met
1635 1640 1645

Asn Gly Arg Arg Leu Ala Ala Gly Arg Asn Thr Val Val Tyr Thr Ala 
1650 1655 1660 

Gln Gly Gly Tyr Tyr Ala Val Met Val Val Val Asp Gly Lys Ser Tyr
1665 1670 1675 1680 

Val Glu Lys Leu Ala Val Lys 
1685 
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Claims 

1. A method for the detection of evidence of periodontal disease in human or animal tissue
or fluid samples, said method comprising contacting said sample with a DNA probe wherein said
probe comprises a detectable single-stranded DNA having a nucleotide sequence sufficiently
homologous with the DNA of Porphyromonas gingivalis so that the DNA of the probe specifically
and selectively hybridizes with the DNA of said bacteria for detection of said probe bound to said
homologous DNA.

2. The method, according to claim 1, wherein said DNA probe comprises a nucleotide
sequence selected from the group consisting of SEQ ID NO. 11, SEQ ID NO. 15, SEQ ID NO. 17,
SEQ ID NO. 19, SEQ ID NO. 21, and SEQ ID NO. 23, or a fragment or variant thereof, said
fragment or variant having sufficient homology with said sequences to specifically and selectively
hybridize thereto.

3. The method, according to claim 1, wherein said DNA probe comprises a nucleotide
sequence selected from the group consisting of SEQ ID NO. 1, SEQ ID NO. 3, SEQ ID NO. 5, SEQ
ID NO. 7, SEQ ID NO. 9, SEQ ID NO. 13, SEQ ID NO. 25, and SEQ ID NO. 28, or a fragment or
variant thereof.

4. The method, according to claim 2, wherein said nucleotide sequence encodes a
polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO. 12,
SEQ ID NO. 16, SEQ ID NO. 18, SEQ ID NO. 20, SEQ ID NO. 22, and SEQ ID NO. 24, or a
fragment or variant thereof.

5. A Porphyromonas gingivalis gene encoding a polypeptide, said polypeptide having an
amino acid sequence selected from the group consisting of SEQ ID NO. 2, SEQ ID NO. 4, SEQ ID
NO. 6, SEQ ID NO. 8, SEQ ID NO. 10, SEQ ID NO. 14, SEQ ID NO. 16, SEQ ID NO. 18, SEQ
ID NO. 20, SEQ ID NO. 22, SEQ ID NO. 26, SEQ ID NO. 27, and SEQ ID NO. 29, or a fragment
or variant thereof.

6. The gene, according to claim 3, said gene comprising the nucleotide sequence selected
from the group consisting of SEQ ID NO. 1, SEQ ID NO. 3, SEQ ID NO. 5, SEQ ID NO. 7, SEQ
ID NO. 9, SEQ ID NO. 13, SEQ ID NO. 15, SEQ ID NO. 17, SEQ ID NO. 19, SEQ ID NO. 21,
SEQ ID NO. 25, and SEQ ID NO. 28, or a fragment or variant thereof.

7. A host cell transformed with a Porphyromonas gingivalis gene which encodes a
Porphyromonas gingivalis antigen, said gene selected from the group consisting of SEQ ID NO.
1, SEQ ID NO. 3, SEQ ID NO. 5, SEQ ID NO. 7, SEQ ID NO. 9, SEQ ID NO. 13, SEQ ID NO.
15, SEQ ID NO. 17, SEQ ID NO. 19, SEQ ID NO. 21, SEQ ID NO. 25, and SEQ ID NO. 28, or a
fragment or variant thereof.

8. The recombinant cell, according to claim 7, which has all of the identifying characteristics
of ATCC 67733.

9. The recombinant cell, according to claim 7, which has all the identifying characteristics
of ATCC 67734.

10. A polypeptide wherein said polypeptide has an amino acid sequence selected from the
group consisting of SEQ ID NO. 2, SEQ ID NO. 4, SEQ ID NO. 6, SEQ ID NO. 8, SEQ ID NO.
10, SEQ ID NO. 12, SEQ ID NO. 14, SEQ ID NO. 16, SEQ ID NO. 18, SEQ ID NO. 20, SEQ ID
NO. 22, SEQ ID NO. 24, SEQ ID NO. 26, SEQ ID NO. 27, and SEQ ID NO. 29, or a fragment or
variant thereof.

11. A method for detecting the presence of anti-Porphyromonas gingivalis antibodies in
a biological fluid sample, said method comprising
(a) contacting the sample with whole transformed host cell or cell lysate, wherein said
cell expresses Porphyromonas gingivalis-specific antigens, said contacting done
under conditions compatible with specific antigen/antibody immunocomplex
formation between said expressed antigens and antibodies present in the sample;
and
(b) detecting immunocomplex formation by means of a label to thereby detect the
presence of Porphyromonas gingivalis antibodies in the sample.

12. The method, according to claim 11, wherein said Porphyromonas gingivalis antigen
expressed by the host cell or cell lysate is a polypeptide having the amino acid sequence selected
from the group consisting of SEQ ID NO. 2, SEQ ID NO. 4, SEQ ID NO. 6, SEQ ID NO. 8, SEQ
ID NO. 10, SEQ ID NO. 12, SEQ ID NO. 14, SEQ ID NO. 16, SEQ ID NO. 18, SEQ ID NO. 20,
SEQ ID NO. 22, SEQ ID NO. 24, SEQ ID NO. 26, SEQ ID NO. 27, and SEQ ID NO. 29, or a
fragment or variant thereof.



13. The method, according to claim 11, wherein said Porphyromonas gingivalis antigen
is encoded by a nucleic acid comprising a nucleotide sequence selected from the group consisting of
SEQ ID NO. 1, SEQ ID NO. 3, SEQ ID NO. 5, SEQ ID NO. 7, SEQ ID NO. 9, SEQ ID NO. 11,
SEQ ID NO. 13, SEQ ID NO. 15, SEQ ID NO. 17, SEQ ID NO. 19, SEQ ID NO. 21, SEQ ID NO.
23, SEQ ID NO. 25, and SEQ ID NO. 28, or a fragment or variant thereof.

14. A method for vaccinating a susceptible human or animal host to confer immunity to
periodontal disease, said method comprising administering an immunizing amount of a transformed
host cell or cell lysate, or a product of a transformed host cell, wherein said cell has been transformed
with a DNA fragment which encodes an amino acid sequence selected from the group consisting of
SEQ ID NO. 2, SEQ ID NO. 4, SEQ ID NO. 6, SEQ ID NO. 8, SEQ ID NO. 10, SEQ ID NO. 12,
SEQ ID NO. 14, SEQ ID NO. 16, SEQ ID NO. 18, SEQ ID NO. 20, SEQ ID NO. 22, SEQ ID NO.
24, SEQ ID NO. 26, SEQ ID NO. 27, and SEQ ID NO. 29, or a fragment or variant thereof.

15. The method, according to claim 14, wherein said DNA fragment has the nucleotide
sequence selected from the group consisting of SEQ ID NO. 1, SEQ ID NO. 3, SEQ ID NO. 5, SEQ
ID NO. 7, SEQ ID NO. 9, SEQ ID NO. 11, SEQ ID NO. 13, SEQ ID NO. 15, SEQ ID NO. 17, SEQ
ID NO. 19, SEQ ID NO. 21, SEQ ID NO. 23, SEQ ID NO. 25, and SEQ ID NO. 28, or a fragment
or variant thereof.



16. A vaccine for conferring immunity to periodontal disease on a susceptible human or
animal host, said vaccine comprising an immunizing amount of a DNA sequence, a host cell
transformed with said DNA sequence, or a product or lysate of said transformed host cell, wherein
said DNA sequence encodes an amino acid sequence selected from the group consisting of SEQ ID
NO. 2, SEQ ID NO. 4, SEQ ID NO. 6, SEQ ID NO. 8, SEQ ID NO. 10, SEQ ID NO. 14, SEQ ID
NO. 16, SEQ ID NO. 18, SEQ ID NO. 20, SEQ ID NO. 22, SEQ ID NO. 24, and SEQ ID NO. 29,
or a fragment or variant thereof.

17. The vaccine, according to claim 16, wherein said DNA sequence is a sequence selected
from the group consisting of SEQ ID NO. 1, SEQ ID NO. 3, SEQ ID NO. 5, SEQ ID NO. 7, SEQ
ID NO. 9, SEQ ID NO. 13, SEQ ID NO. 15, SEQ ID NO. 17, SEQ ID NO. 19, SEQ ID NO. 21,
SEQ ID NO. 25, and SEQ ID NO. 28, or a fragment or variant thereof.

18. The vaccine, according to claim 16, wherein said transformed host cells are Salmonella.

19. A monoclonal antibody reagent useful in determining the presence of a periodontal
pathogen, said reagent comprising at least one monoclonal antibody species-specific to
Porphyromonas gingivalis, wherein said monoclonal antibody specifically and selectively binds to
a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO. 2,
SEQ ID NO. 4, SEQ ID NO. 6, SEQ ID NO. 8, SEQ ID NO. 10, SEQ ID NO. 12, SEQ ID NO. 14,
SEQ ID NO. 16, SEQ ID NO. 18, SEQ ID NO. 20, SEQ ID NO. 22, SEQ ID NO. 24, SEQ ID NO.
26, SEQ ID NO. 27, and SEQ ID NO. 29, or a fragment or variant thereof.

20. A kit for detecting evidence of periodontal disease, wherein said kit comprises a
Porphyromonas gingivalis-specific component selected from the group consisting of the hagA,
hagB, hagC, hagD gene, and prtP, a polypeptide product of said gene, and an antibody to said
polypeptide gene product.

1. Claims 1-7, 10-20 partially: HagD, gene encoding the polypeptide,
antibodies against the hemagglutinin and use of the gene or fragments
thereof or of cells expressing the gene in methods for the detection of
Porphyromonas gingivalis or antibodies against the microorganism or as
vaccine.

2. Claims 1-7, 10-20 partially: PrtP, gene encoding the polypeptide,
antibodies against the protease and use of the gene or fragments thereof or
of cells expressing the gene in methods for the detection of Porphyromonas
gingivalis or antibodies against the microorganism or as vaccine.

3. Claims 1-7, 10-19 partially: HagE, gene encoding the polypeptide,
antibodies against the protease and use of the gene or fragments thereof or
of cells expressing the gene in methods for the detection of Porphyromonas
gingivalis or antibodies against the microorganism or as vaccine.

4. Claims 8 completely, 1-7, 10-20 partially: HagA, gene encoding the
polypeptide, antibodies against the hemagglutinin and use of the gene or
fragments thereof or of cells expressing the gene in methods for the detection
of Porphyromonas gingivalis or antibodies against the microorganism or as
vaccine.

5. Claims 9 completely, 1, 3, 5-7, 10-20 partially: HagB, gene encoding the
polypeptide, antibodies against the hemagglutinin and use of the gene or
fragments thereof or of cells expressing the gene in methods for the detection
of Porphyromonas gingivalis or antibodies against the microorganism or as
vaccine.

6. Claims 1, 3, 5-7, 10-20 partially: HagC, gene encoding the polypeptide,
antibodies against the hemagglutinin and use of the gene or fragments
thereof or of cells expressing the gene in methods for the detection of
Porphyromonas gingivalis or antibodies against the microorganism or as
vaccine.
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principle, due to the fact that some of the polypeptides have already been 
disclosed, further separate inventions. However, taking into account the
balance between necessary search effort and the levying of additional fees, 
the ISA has regrouped the different claimed inventions on the basis of the 6 
different proteins. 
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